
Pulse-code modulation


Pulse-code modulation (PCM), invented by Alec Reeves in 1937, is a method used to
digitally represent sampled analog signals. It is the standard form for digital audio in
computers and various Blu-ray, Compact Disc and DVD formats, as well as other uses
such as digital telephone systems. A PCM stream is a digital representation of an analog
signal, in which the magnitude of the analog signal is sampled regularly at uniform
intervals, with each sample being quantized to the nearest value within a range of digital
steps.

PCM streams have two basic properties that determine their fidelity to the original analog
signal: the sampling rate, which is the number of times per second that samples are taken;
and the bit depth, which determines the number of possible digital values that each
sample can take.

Contents

• 1 Modulation
• 2 Demodulation
• 3 Limitations
• 4 Digitization as part of the PCM process
• 5 Encoding for transmission
• 6 History
• 7 Nomenclature
• 8 See also
• 9 References
• 10 Further reading

• 11 External links

Modulation
Sampling and quantization of a signal (red) for 4-bit PCM

In the diagram, a sine wave (red curve) is sampled and quantized for pulse code
modulation. The sine wave is sampled at regular intervals, shown as ticks on the x-axis.
For each sample, one of the available values (ticks on the y-axis) is chosen by some
algorithm. This produces a fully discrete representation of the input signal (shaded area)
that can be easily encoded as digital data for storage or manipulation. For the sine wave
example at right, we can verify that the quantized values at the sampling moments are 7,
9, 11, 12, 13, 14, 14, 15, 15, 15, 14, etc. Encoding these values as binary numbers would
result in the following set of nibbles: 0111 (2³×0 + 2²×1 + 2¹×1 + 2⁰×1 = 0+4+2+1 = 7), 1001,
1011, 1100, 1101, 1110, 1110, 1111, 1111, 1111, 1110, etc. These digital values could
then be further processed or analyzed by a purpose-specific digital signal processor or
general purpose DSP. Several Pulse Code Modulation streams could also be multiplexed
into a larger aggregate data stream, generally for transmission of multiple streams over a
single physical link. One technique is called time-division multiplexing, or TDM, and is
widely used, notably in the modern public telephone system. Another technique is called
Frequency-division multiplexing, where the signal is assigned a frequency in a spectrum,
and transmitted along with other signals inside that spectrum. Currently, TDM is much
more widely used than FDM because of its natural compatibility with digital
communication, and generally lower bandwidth requirements.
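As a concrete illustration, the following C sketch samples one period of a sine wave and
quantizes each sample to 4 bits. The amplitude, phase and sampling grid here are
assumptions chosen for illustration, so the printed codes will not necessarily reproduce
the figure's exact 7, 9, 11, ... sequence.

    /* A minimal sketch of 4-bit PCM: sample a sine wave and quantize each
     * sample to one of 16 levels. Signal parameters are illustrative. */
    #include <stdio.h>
    #include <math.h>

    int main(void)
    {
        const double PI     = 3.14159265358979323846;
        const int    levels = 16;      /* 4-bit PCM: 2^4 levels, 0..15 */
        const double fs     = 16.0;    /* samples per sine period      */

        for (int n = 0; n < 16; n++) {
            double x = sin(2.0 * PI * n / fs);       /* analog value, -1..1 */
            /* Map -1..1 onto 0..15 and round to the nearest level. */
            int q = (int)lround((x + 1.0) / 2.0 * (levels - 1));
            printf("n=%2d  x=%+.3f  code=%2d  nibble=%c%c%c%c\n", n, x, q,
                   (q & 8) ? '1' : '0', (q & 4) ? '1' : '0',
                   (q & 2) ? '1' : '0', (q & 1) ? '1' : '0');
        }
        return 0;
    }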

There are many ways to implement a real device that performs this task. In real systems,
such a device is commonly implemented on a single integrated circuit that lacks only the
clock necessary for sampling, and is generally referred to as an ADC (Analog-to-Digital
converter). These devices will produce on their output a binary representation of the input
whenever they are triggered by a clock signal, which would then be read by a processor
of some sort.

Demodulation
To produce output from the sampled data, the procedure of modulation is applied in
reverse. After each sampling period has passed, the next value is read and a signal is
shifted to the new value. As a result of these transitions, the signal will have a significant
amount of high-frequency energy. To smooth out the signal and remove these undesirable
aliasing frequencies, the signal would be passed through analog filters that suppress
energy outside the expected frequency range (that is, greater than the Nyquist frequency
fs / 2). Some systems use digital filtering to remove some of the aliasing, converting the
signal from digital to analog at a higher sample rate such that the analog filter required
for anti-aliasing is much simpler. In some systems, no explicit filtering is done at all; as
it's impossible for any system to reproduce a signal with infinite bandwidth, inherent
losses in the system compensate for the artifacts — or the system simply does not require
much precision. The sampling theorem suggests that practical PCM devices, provided a
sampling frequency that is sufficiently greater than that of the input signal, can operate
without introducing significant distortions within their designed frequency bands.

The electronics involved in producing an accurate analog signal from the discrete data are
similar to those used for generating the digital signal. These devices are DACs (digital-to-
analog converters), and operate similarly to ADCs. They produce on their output a
voltage or current (depending on type) that represents the value presented on their inputs.
This output would then generally be filtered and amplified for use.

Limitations
There are two sources of impairment implicit in any PCM system:

• Choosing a discrete value near the analog signal for each sample leads to
quantization error, which swings between −q/2 and +q/2, where q is the size of a
quantization step. In the ideal case (with a fully linear ADC) it is uniformly distributed
over this interval, with zero mean and variance of q²/12 (see the sketch after this list).
• Between samples no measurement of the signal is made; the sampling theorem
guarantees non-ambiguous representation and recovery of the signal only if it has
no energy at frequency fs/2 or higher (one half the sampling frequency, known as
the Nyquist frequency); higher frequencies will generally not be correctly
represented or recovered.
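The q²/12 figure follows from the variance of a uniform density 1/q on [−q/2, q/2]:
Var = (1/q) ∫ e² de over that interval, which evaluates to q²/12. A minimal C sketch,
assuming a uniformly distributed input, confirms this numerically:

    /* Empirical check that uniform quantization error has variance ~ q^2/12.
     * Assumes a mid-tread quantizer with step q and a uniformly distributed
     * input; rand() is used for brevity, not statistical quality. */
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        const double q = 0.1;          /* quantization step           */
        const long   N = 1000000;      /* number of simulated samples */
        double sum = 0.0, sumsq = 0.0;

        for (long i = 0; i < N; i++) {
            double x  = (double)rand() / RAND_MAX * 10.0 - 5.0;  /* input */
            double xq = q * (long)((x / q) + (x >= 0 ? 0.5 : -0.5)); /* round */
            double e  = xq - x;        /* quantization error          */
            sum   += e;
            sumsq += e * e;
        }
        double mean = sum / N;
        double var  = sumsq / N - mean * mean;
        printf("mean error:     %.6g (expect ~0)\n", mean);
        printf("error variance: %.6g (q^2/12 = %.6g)\n", var, q * q / 12.0);
        return 0;
    }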

As samples are dependent on time, an accurate clock is required for accurate


reproduction. If either the encoding or decoding clock is not stable, its frequency drift
will directly affect the output quality of the device. A slight difference between the
encoding and decoding clock frequencies is not generally a major concern; a small
constant error is not noticeable. Clock error does become a major issue if the clock is not
stable, however. A drifting clock, even with a relatively small error, will cause very
obvious distortions in audio and video signals, for example.

A note on clocking: PCM data from a master whose clock frequency cannot be influenced
requires an exact clock at the decoding side to ensure that all the data is used in a
continuous stream without buffer underrun or buffer overflow. Any frequency difference
will be audible at the output, since the number of samples per time interval cannot be
correct. The data speed of a compact disc can be steered by means of a servo that
controls the rotation speed of the disc; here the output clock is the master clock. For all
"external master" systems, such as DAB, the output stream must be decoded with a
regenerated and exactly synchronous clock. When the wanted output sample rate differs
from the clock of the incoming data stream, a sample rate converter must be inserted in
the chain to convert the samples to the new clock domain.

Digitization as part of the PCM process


In conventional PCM, the analog signal may be processed (e.g., by amplitude
compression) before being digitized. Once the signal is digitized, the PCM signal is
usually subjected to further processing (e.g., digital data compression).

PCM with linear quantization is known as Linear PCM (LPCM).[1]

Some forms of PCM combine signal processing with coding. Older versions of these
systems applied the processing in the analog domain as part of the A/D process; newer
implementations do so in the digital domain. These simple techniques have been largely
rendered obsolete by modern transform-based audio compression techniques.

• DPCM encodes the PCM values as differences between the current and the
predicted value. An algorithm predicts the next sample based on the previous
samples, and the encoder stores only the difference between this prediction and
the actual value. If the prediction is reasonable, fewer bits can be used to represent
the same information. For audio, this type of encoding reduces the number of bits
required per sample by about 25% compared to PCM. (A minimal sketch follows
this list.)
• Adaptive DPCM (ADPCM) is a variant of DPCM that varies the size of the
quantization step, to allow further reduction of the required bandwidth for a given
signal-to-noise ratio.
• Delta modulation is a form of DPCM which uses one bit per sample.
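To make the DPCM idea concrete, here is a minimal C round trip using the simplest
possible predictor, the previous sample. Real codecs use better predictors and then
entropy-code the differences, so this is illustrative only.

    /* Minimal DPCM round trip with a previous-sample predictor.
     * encode: d[n] = x[n] - x[n-1];  decode: x[n] = x[n-1] + d[n].
     * With a good predictor the differences d[n] are small and can be
     * coded with fewer bits than the raw samples. */
    #include <stdio.h>

    #define N 8

    int main(void)
    {
        int x[N] = {100, 104, 109, 113, 116, 118, 119, 119}; /* example input */
        int d[N], y[N];

        int prev = 0;
        for (int n = 0; n < N; n++) {   /* encoder */
            d[n] = x[n] - prev;         /* difference from prediction */
            prev = x[n];
        }
        prev = 0;
        for (int n = 0; n < N; n++) {   /* decoder */
            y[n] = prev + d[n];         /* add difference back */
            prev = y[n];
        }
        for (int n = 0; n < N; n++)
            printf("x=%4d  d=%+4d  reconstructed=%4d\n", x[n], d[n], y[n]);
        return 0;
    }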

In telephony, a standard audio signal for a single phone call is encoded as 8,000 analog
samples per second, of 8 bits each, giving a 64 kbit/s digital signal known as DS0. The
default signal compression encoding on a DS0 is either μ-law (mu-law) PCM (North
America and Japan) or A-law PCM (Europe and most of the rest of the world). These are
logarithmic compression systems where a 12- or 13-bit linear PCM sample number is
mapped into an 8-bit value. This system is described by international standard G.711. An
alternative proposal for a floating-point representation, with 5-bit mantissa and 3-bit
radix, was abandoned.

Where circuit costs are high and loss of voice quality is acceptable, it sometimes makes
sense to compress the voice signal even further. An ADPCM algorithm is used to map a
series of 8-bit µ-law or A-law PCM samples into a series of 4-bit ADPCM samples. In
this way, the capacity of the line is doubled. The technique is detailed in the G.726
standard.

Later it was found that even further compression was possible and additional standards
were published. Some of these international standards describe systems and ideas which
are covered by privately owned patents and thus use of these standards requires payments
to the patent holders.
Some ADPCM techniques are used in Voice over IP communications.

Encoding for transmission


Main article: Line code

Pulse-code modulation can be either return-to-zero (RZ) or non-return-to-zero (NRZ).


For a NRZ system to be synchronized using in-band information, there must not be long
sequences of identical symbols, such as ones or zeroes. For binary PCM systems, the
density of 1-symbols is called ones-density.[2]

Ones-density is often controlled using precoding techniques such as Run Length Limited
encoding, where the PCM code is expanded into a slightly longer code with a guaranteed
bound on ones-density before modulation into the channel. In other cases, extra framing
bits are added into the stream which guarantee at least occasional symbol transitions.

Another technique used to control ones-density is the use of a scrambler polynomial on
the raw data. Scrambling tends to turn the raw data stream into a stream that looks
pseudo-random, but the raw stream can be recovered exactly by reversing the effect of
the polynomial. In this case, long runs of zeroes or ones are still possible on the output,
but are considered unlikely enough to be within normal engineering tolerance.
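As a sketch of the idea, a self-synchronizing scrambler can be built from a shift register;
feeding the scrambled stream through the same structure inverts it. The tap positions
used here (delays 4 and 7) are a hypothetical polynomial chosen for illustration, not tied
to any particular standard mentioned above.

    /* Self-synchronizing scrambler/descrambler sketch.
     * Scrambler:   s[n] = d[n] ^ s[n-4] ^ s[n-7]
     * Descrambler: d[n] = s[n] ^ s[n-4] ^ s[n-7]
     * Tap positions are illustrative. Both ends share a nonzero seed
     * so the demo decodes correctly from the first bit. */
    #include <stdio.h>
    #define N 16

    int main(void)
    {
        int data[N] = {0};             /* worst case: a long run of zeros */
        int s[N + 7], h[N + 7];
        int out[N], rec[N];

        for (int i = 0; i < 7; i++)    /* shared scrambler state seed */
            s[i] = h[i] = 1;

        for (int n = 0; n < N; n++) {  /* scramble */
            s[n + 7] = data[n] ^ s[n + 3] ^ s[n];   /* s[n-4], s[n-7] */
            out[n] = s[n + 7];
        }
        for (int n = 0; n < N; n++) {  /* descramble */
            h[n + 7] = out[n];
            rec[n] = h[n + 7] ^ h[n + 3] ^ h[n];
        }
        for (int n = 0; n < N; n++) printf("%d", out[n]);
        printf("  <- scrambled (run of zeros broken up)\n");
        for (int n = 0; n < N; n++) printf("%d", rec[n]);
        printf("  <- descrambled (matches input)\n");
        return 0;
    }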

In other cases, the long term DC value of the modulated signal is important, as building
up a DC offset will tend to bias detector circuits out of their operating range. In this case
special measures are taken to keep a count of the cumulative DC offset, and to modify the
codes if necessary to make the DC offset always tend back to zero.

Many of these codes are bipolar codes, where the pulses can be positive, negative or
absent. In the typical alternate mark inversion code, non-zero pulses alternate between
being positive and negative. These rules may be violated to generate special symbols
used for framing or other special purposes.
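A minimal sketch of alternate mark inversion: zeros map to no pulse, and successive
ones alternate polarity, keeping the long-term DC value of the line near zero.

    /* Alternate mark inversion (AMI) sketch: 0 -> no pulse, 1 -> pulses of
     * alternating polarity, so the running DC sum stays near zero. */
    #include <stdio.h>

    int main(void)
    {
        int bits[12] = {1,0,1,1,0,0,1,0,1,1,1,0};
        int polarity = +1;                 /* next mark's sign */

        for (int i = 0; i < 12; i++) {
            int pulse = 0;
            if (bits[i]) {
                pulse = polarity;
                polarity = -polarity;      /* alternate marks */
            }
            printf("%+d ", pulse);
        }
        printf("\n");   /* prints: +1 +0 -1 +1 +0 +0 -1 +0 +1 -1 +1 +0 */
        return 0;
    }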

See also: T-carrier and E-carrier

History
In the history of electrical communications, the earliest reason for sampling a signal was
to interlace samples from different telegraphy sources and convey them over a single
telegraph cable. Telegraph time-division multiplexing (TDM) was achieved as early as
1853 by the American inventor Moses B. Farmer. The electrical engineer W. M. Miner,
in 1903, used an electro-mechanical commutator for time-division multiplexing of multiple
telegraph signals, and also applied this technology to telephony. He obtained intelligible
speech from channels sampled at a rate above 3500–4300 Hz; below this, the result was
unsatisfactory. This was TDM, but pulse-amplitude modulation (PAM) rather than PCM.
In 1926, Paul M. Rainey of Western Electric patented a facsimile machine which
transmitted its signal using 5-bit PCM, encoded by an opto-mechanical analog-to-digital
converter.[3] The machine did not go into production. British engineer Alec Reeves,
unaware of previous work, conceived the use of PCM for voice communication in 1937
while working for International Telephone and Telegraph in France. He described the
theory and advantages, but no practical use resulted. Reeves filed for a French patent in
1938, and his U.S. patent was granted in 1943.

The first transmission of speech by digital techniques was the SIGSALY vocoder
encryption equipment used for high-level Allied communications during World War II
from 1943. In 1943, the Bell Labs researchers who designed the SIGSALY system
became aware of the use of PCM binary coding as already proposed by Alec Reeves. In
1949 for the Canadian Navy's DATAR system, Ferranti Canada built a working PCM
radio system that was able to transmit digitized radar data over long distances.[4]

PCM in the late 1940s and early 1950s used a cathode-ray coding tube with a plate
electrode having encoding perforations.[5][6] As in an oscilloscope, the beam was swept
horizontally at the sample rate while the vertical deflection was controlled by the input
analog signal, causing the beam to pass through higher or lower portions of the perforated
plate. The plate collected or passed the beam, producing current variations in binary code,
one bit at a time. Rather than natural binary, the grid of Goodall's later tube was
perforated to produce a glitch-free Gray code, and produced all bits simultaneously by
using a fan beam instead of a scanning beam.

The National Inventors Hall of Fame has honored Bernard M. Oliver[7] and Claude
Shannon[8] as the inventors of PCM,[9] as described in 'Communication System
Employing Pulse Code Modulation,' U.S. Patent 2,801,281 filed in 1946 and 1952,
granted in 1956. Another patent by the same title was filed by John R. Pierce in 1945, and
issued in 1948: U.S. Patent 2,437,707. The three of them published "The Philosophy of
PCM" in 1948.[10]

Pulse-code modulation (PCM) was used in Japan by Denon in 1972 for the mastering and
production of analogue phonograph records, using a 2-inch Quadruplex-format videotape
recorder for its transport, but this was not developed into a consumer product.

Nomenclature
The word pulse in the term Pulse-Code Modulation refers to the "pulses" to be found in
the transmission line. This perhaps is a natural consequence of this technique having
evolved alongside two analog methods, pulse width modulation and pulse position
modulation, in which the information to be encoded is in fact represented by discrete
signal pulses of varying width or position, respectively. In this respect, PCM bears little
resemblance to these other forms of signal encoding, except that all can be used in time
division multiplexing, and the binary numbers of the PCM codes are represented as
electrical pulses. The device that performs the coding and decoding function in a
telephone circuit is called a codec.
Companding

Figures: the original signal, and the same signal after compressing (before expanding).

In telecommunication and signal processing, companding (occasionally called
compansion) is a method of mitigating the detrimental effects of a channel with limited
dynamic range. The name is a portmanteau of compressing and expanding.

While the compression used in audio recording and the like depends on a variable-gain
amplifier, and so is a locally linear process (linear for short regions, but not globally),
companding is non-linear and takes place in the same way at all points in time. The
dynamic range of a signal is compressed before transmission and is expanded to the
original value at the receiver.

The electronic circuit that does this is called a compandor and works by compressing or
expanding the dynamic range of an analog electronic signal such as sound. One variety is
a triplet of amplifiers: a logarithmic amplifier, followed by a variable-gain linear
amplifier and an exponential amplifier. Such a triplet has the property that its output
voltage is proportional to the input voltage raised to an adjustable power. Compandors
are used in concert audio systems and in some noise reduction schemes such as dbx and
Dolby NR (all versions).

Companding can also refer to the use of compression, where gain is decreased when
levels rise above a certain threshold, and its complement, expansion, where gain is
increased when levels drop below a certain threshold.

The use of companding allows signals with a large dynamic range to be transmitted over
facilities that have a smaller dynamic range capability. For example, it is employed in
professional wireless microphones since the dynamic range of the microphone audio
signal itself is larger than the dynamic range provided by radio transmission.
Companding also reduces the noise and crosstalk levels at the receiver.

Companding is used in digital telephony systems, compressing before input to an
analog-to-digital converter, and then expanding after a digital-to-analog converter. This is
equivalent to using a non-linear ADC, as in a T-carrier telephone system that implements
A-law or μ-law companding. The method is also used in digital file formats for better
signal-to-noise ratio (SNR) at lower bit rates. For example, a linearly encoded 16-bit
PCM signal can be converted to an 8-bit WAV or AU file while maintaining a decent
SNR by compressing before the transition to 8 bits and expanding after the conversion back
to 16 bits. This is effectively a form of lossy audio data compression.
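A minimal sketch of the compress-quantize-expand round trip, using a square-root
compressor as an illustrative stand-in for the μ-law and A-law curves described below:
small values get finer effective quantization at the cost of coarser steps near full scale.

    /* Companding sketch: compress with y = sqrt(x), quantize y to 8 bits,
     * then expand with x = y*y. The square-root curve is an illustrative
     * choice, not the G.711 mu-law/A-law curve. Inputs here are 0..1. */
    #include <stdio.h>
    #include <math.h>

    static double quant8(double v)          /* uniform 8-bit quantizer */
    {
        return round(v * 255.0) / 255.0;    /* v in 0..1 -> 256 levels */
    }

    int main(void)
    {
        const double test[] = {0.001, 0.01, 0.1, 0.5, 1.0};

        for (int i = 0; i < 5; i++) {
            double x  = test[i];
            double yc = quant8(sqrt(x));    /* compress, then quantize */
            double xc = yc * yc;            /* expand                  */
            double xl = quant8(x);          /* plain linear 8-bit      */
            printf("x=%.4f  companded=%.6f  linear=%.6f\n", x, xc, xl);
        }
        return 0;
    }

For the smallest inputs the companded path preserves several significant figures where
the plain linear 8-bit quantizer rounds the value away entirely.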

Many music equipment manufacturers (Roland, Yamaha, Korg) used companding for
data compression in their digital synthesizers. This dates back to the late 1980s, when
memory chips were often one of the most costly components in an instrument.
Manufacturers usually expressed the amount of memory in its compressed form: the
24 MB waveform ROM in the Korg Trinity, for example, holds 48 MB of data, although
the unit physically contains only 24 MB of ROM. Similarly, Roland SR-JV expansion
boards were usually advertised as 8 MB boards containing "16 MB-equivalent content".
Careless copying of this information, omitting the word "equivalent", often led to
confusion.

Contents

• 1 History
• 2 See also
• 3 References

• 4 External links

History
The use of companding in an analog picture transmission system was patented by A. B.
Clark of AT&T in 1928 (filed in 1925):[1]
In the transmission of pictures by electric currents, the method which consists in sending
currents varied in a non-linear relation to the light values of the successive elements of
the picture to be transmitted, and at the receiving end exposing corresponding elements of
a sensitive surface to light varied in inverse non-linear relation to the received current.
—A. B. Clark patent

In 1942, Clark and his team completed the SIGSALY secure voice transmission system
that included the first use of companding in a PCM (digital) system.[2]

In 1953, B. Smith showed that a nonlinear DAC could result in the inverse nonlinearity in
a successive-approximation ADC configuration, simplifying the design of digital
companding systems.[3]

In 1970, H. Kaneko developed the uniform description of segment (piecewise linear)


companding laws that had by then been adopted in digital telephony.[4]

μ-law algorithm

Companding of μ-law and A-law algorithms

The µ-law algorithm (often u-law, ulaw, mu-law, pronounced /ˈmjuː/) is a companding
algorithm, primarily used in the digital telecommunication systems of North America and
Japan. Companding algorithms reduce the dynamic range of an audio signal. In analog
systems, this can increase the signal-to-noise ratio (SNR) achieved during transmission,
and in the digital domain, it can reduce the quantization error (hence increasing signal to
quantization noise ratio). These SNR increases can be traded instead for reduced
bandwidth for equivalent SNR.
It is similar to the A-law algorithm used in regions where digital telecommunication
signals are carried on E-1 circuits, e.g. Europe.

Contents

• 1 Algorithm Types
o 1.1 Continuous
o 1.2 Discrete
• 2 Implementation
• 3 Usage Justification
• 4 Comparison with A-law
• 5 See also
• 6 References

• 7 External links

Algorithm Types


There are two forms of this algorithm: an analog version, and a quantized digital version.

Continuous

For a given input x (normalized so that −1 ≤ x ≤ 1), the equation for μ-law encoding is[1]

    F(x) = sgn(x) · ln(1 + μ|x|) / ln(1 + μ),

where μ = 255 (8 bits) in the North American and Japanese standards. It is important to
note that the range of this function is -1 to 1.

μ-law expansion is then given by the inverse equation:

    F⁻¹(y) = sgn(y) · ((1 + μ)^|y| − 1) / μ.

The equations are culled from Cisco's Waveform Coding Techniques.
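A direct C transcription of the two formulas above, with μ = 255 (the function names are
ours):

    /* Continuous mu-law compression and expansion, transcribed from the
     * formulas above. Inputs and outputs are normalized to -1..1. */
    #include <stdio.h>
    #include <math.h>

    #define MU 255.0

    static double mulaw_compress(double x)
    {
        double s = (x < 0) ? -1.0 : 1.0;
        return s * log(1.0 + MU * fabs(x)) / log(1.0 + MU);
    }

    static double mulaw_expand(double y)
    {
        double s = (y < 0) ? -1.0 : 1.0;
        return s * (pow(1.0 + MU, fabs(y)) - 1.0) / MU;
    }

    int main(void)
    {
        for (double x = -1.0; x <= 1.001; x += 0.25) {
            double y = mulaw_compress(x);
            printf("x=%+.3f  compressed=%+.4f  expanded back=%+.4f\n",
                   x, y, mulaw_expand(y));
        }
        return 0;
    }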

Discrete

This is defined in ITU-T Recommendation G.711 [2]

G.711 is unclear about what the values at the limit of a range code up as. (e.g. whether
+31 codes to 0xEF or 0xF0). However G.191 provides example C code for a μ-law
encoder which gives the following encoding. Note the difference between the positive
and negative ranges. e.g. the negative range corresponding to +30 to +1 is -31 to -2. This
is accounted for by the use of a 1's complement (simple bit inversion) rather than 2's
complement to convert a negative value to a positive value during encoding.

Quantized μ-law algorithm

    14-bit binary linear input code           8-bit compressed code
    +8158 to +4063 in 16 intervals of 256     0x80 + interval number
    +4062 to +2015 in 16 intervals of 128     0x90 + interval number
    +2014 to +991 in 16 intervals of 64       0xA0 + interval number
    +990 to +479 in 16 intervals of 32        0xB0 + interval number
    +478 to +223 in 16 intervals of 16        0xC0 + interval number
    +222 to +95 in 16 intervals of 8          0xD0 + interval number
    +94 to +31 in 16 intervals of 4           0xE0 + interval number
    +30 to +1 in 15 intervals of 2            0xF0 + interval number
    0                                         0xFF
    -1                                        0x7F
    -31 to -2 in 15 intervals of 2            0x70 + interval number
    -95 to -32 in 16 intervals of 4           0x60 + interval number
    -223 to -96 in 16 intervals of 8          0x50 + interval number
    -479 to -224 in 16 intervals of 16        0x40 + interval number
    -991 to -480 in 16 intervals of 32        0x30 + interval number
    -2015 to -992 in 16 intervals of 64       0x20 + interval number
    -4063 to -2016 in 16 intervals of 128     0x10 + interval number
    -8159 to -4064 in 16 intervals of 256     0x00 + interval number
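The table has a compact implementation: biasing the 14-bit magnitude by 33 makes
every segment boundary a power of two, so the segment number is simply the position of
the leading 1-bit. Below is a C sketch derived from the table, a simplified stand-in for
(not a copy of) the G.191 reference code mentioned above.

    /* Segment-based mu-law encoder derived from the table above.
     * Input: 14-bit signed linear sample, -8159..+8158.
     * Adding a bias of 33 aligns the segment edges with powers of two;
     * the final byte is bit-inverted, so input 0 encodes as 0xFF. */
    #include <stdio.h>

    static unsigned char mulaw_encode(int x)
    {
        int sign = 0;
        if (x < 0) {
            x = -x - 1;          /* 1's-complement treatment of negatives */
            sign = 0x80;
        }
        if (x > 8158) x = 8158;  /* clip to the table's range */
        x += 33;                 /* bias: segment n now ends at 2^(n+6)-1 */

        int seg = 0;             /* segment = leading-bit position */
        for (int t = 0x3F; x > t && seg < 7; t = (t << 1) | 1)
            seg++;

        int mantissa = (x >> (seg + 1)) & 0x0F;  /* 4-bit interval number */
        return (unsigned char)~(sign | (seg << 4) | mantissa);
    }

    int main(void)
    {
        int tests[] = {0, 1, 30, 31, 8158, -1, -2, -8159};
        for (int i = 0; i < 8; i++)
            printf("%+6d -> 0x%02X\n", tests[i], mulaw_encode(tests[i]));
        return 0;
    }

Checking against the table: 0 encodes to 0xFF, -1 to 0x7F, +8158 to 0x80 and -8159 to
0x00, matching the first and last rows.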

Implementation
There are three ways of implementing a μ-law algorithm:

Analog
Use an amplifier with non-linear gain to achieve companding entirely in the
analog domain.
Non-linear ADC
Use an Analog to Digital Converter with quantization levels which are unequally
spaced to match the μ-law algorithm.
Digital
Use the quantized digital version of the μ-law algorithm to convert data once it is
in the digital domain.

Usage Justification


This encoding is used because speech has a wide dynamic range. In the analog world,
when mixed with a relatively constant background noise source, the finer detail is lost.
Given that the precision of the detail is compromised anyway, and assuming that the
signal is to be perceived as audio by a human, one can take advantage of the fact that
perceived intensity (loudness) is logarithmic[3] by compressing the signal using a
logarithmic-response op-amp. In telco circuits, most of the noise is injected on the lines,
thus after the compressor, the intended signal will be perceived as significantly louder
than the static, compared to an un-compressed source. This became a common telco
solution, and thus, prior to common digital usage, the μ-law specification was developed
to define an inter-compatible standard.

As the digital age dawned, it was noted that this pre-existing algorithm had the effect of
significantly reducing the number of bits needed to encode recognizable human voice.
Using μ-law, a sample could be effectively encoded in as few as 8 bits, a sample size that
conveniently matched the symbol size of most standard computers.

μ-law encoding effectively reduced the dynamic range of the signal, thereby increasing
the coding efficiency while biasing the signal in a way that results in a signal-to-
distortion ratio that is greater than that obtained by linear encoding for a given number of
bits. This is an early form of perceptual audio encoding.

The μ-law algorithm is also used in the .au format, which dates back at least to the
SPARCstation 1 as the native method used by Sun's /dev/audio interface, widely used as
a de facto standard for Unix sound. The .au format is also used in various common audio
APIs such as the classes in the sun.audio Java package in Java 1.1 and in some C#
methods.

A plot illustrates how μ-law concentrates sampling in the smaller (softer) values: the
horizontal axis is the μ-law byte (0–255) and the vertical axis is the decoded 16-bit linear
value. The image was generated with the Sun Microsystems C routine g711.c, commonly
available on the Internet.
Comparison with A-law
The µ-law algorithm provides a slightly larger dynamic range than the A-law at the cost
of worse proportional distortion for small signals. By convention, A-law is used for an
international connection if at least one country uses it.

A-law algorithm
Graph of μ-law & A-law algorithms

An A-law algorithm is a standard companding algorithm, used in European digital


communications systems to optimize, i.e., modify, the dynamic range of an analog signal
for digitizing.

It is similar to the μ-law algorithm used in North America and Japan.

For a given input x (normalized so that −1 ≤ x ≤ 1), the equation for A-law encoding is as
follows,

    F(x) = sgn(x) · A|x| / (1 + ln A),               for |x| < 1/A,
    F(x) = sgn(x) · (1 + ln(A|x|)) / (1 + ln A),     for 1/A ≤ |x| ≤ 1,

where A is the compression parameter. In Europe, A = 87.6 (the value 87.7 is also seen).

A-law expansion is given by the inverse function,

    F⁻¹(y) = sgn(y) · |y| (1 + ln A) / A,            for |y| < 1/(1 + ln A),
    F⁻¹(y) = sgn(y) · exp(|y| (1 + ln A) − 1) / A,   for 1/(1 + ln A) ≤ |y| ≤ 1.

The reason for this encoding is that the wide dynamic range of speech does not lend itself
well to efficient linear digital encoding. A-law encoding effectively reduces the dynamic
range of the signal, thereby increasing the coding efficiency and resulting in a signal-to-
distortion ratio that is superior to that obtained by linear encoding for a given number of
bits.
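A C transcription of the A-law pair above, with A = 87.6 (the function names are ours):

    /* Continuous A-law compression and expansion per the formulas above.
     * Inputs and outputs are normalized to -1..1. */
    #include <stdio.h>
    #include <math.h>

    #define A_PARAM 87.6

    static double alaw_compress(double x)
    {
        double s = (x < 0) ? -1.0 : 1.0, ax = fabs(x);
        double d = 1.0 + log(A_PARAM);
        if (ax < 1.0 / A_PARAM)
            return s * A_PARAM * ax / d;            /* linear segment */
        return s * (1.0 + log(A_PARAM * ax)) / d;   /* logarithmic    */
    }

    static double alaw_expand(double y)
    {
        double s = (y < 0) ? -1.0 : 1.0, ay = fabs(y);
        double d = 1.0 + log(A_PARAM);
        if (ay < 1.0 / d)
            return s * ay * d / A_PARAM;
        return s * exp(ay * d - 1.0) / A_PARAM;
    }

    int main(void)
    {
        for (double x = -1.0; x <= 1.001; x += 0.25) {
            double y = alaw_compress(x);
            printf("x=%+.3f  compressed=%+.4f  expanded back=%+.4f\n",
                   x, y, alaw_expand(y));
        }
        return 0;
    }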

Comparison to μ-law


The μ-law algorithm provides a slightly larger dynamic range than the A-law at the cost
of worse proportional distortion for small signals. By convention, A-law is used for an
international connection if at least one country uses it.

Narrowband

Narrowband refers to a situation in radio communications where the bandwidth of the


message does not significantly exceed the channel's coherence bandwidth. It is a common
misconception that narrowband refers to a channel which occupies only a "small" amount
of space on the radio spectrum.

The opposite of narrowband is wideband.

In the study of wireless channels, narrowband implies that the channel under
consideration is sufficiently narrow that its frequency response can be considered flat.
The message bandwidth will therefore be less than the coherence bandwidth of the
channel. This is usually used as an idealizing assumption; no channel has perfectly flat
fading, but the analysis of many aspects of wireless systems is greatly simplified if flat
fading can be assumed.

Narrowband can also be used with the audio spectrum to describe sounds which occupy a
narrow range of frequencies.

In telephony, narrowband is usually considered to cover frequencies 300–3400 Hz.

Wideband

In communications, wideband is a relative term used to describe a wide range of
frequencies in a spectrum. A system is typically described as wideband if the message
bandwidth significantly exceeds the channel's coherence bandwidth. Some
communication links have such a high data rate that they are forced to use a wide
bandwidth; other links may have relatively low data rates, but deliberately use a wider
bandwidth than "necessary" for that data rate in order to gain other advantages; see
spread spectrum.

A wideband antenna is one with approximately or exactly the same operating
characteristics over a very wide passband. It is distinguished from a broadband antenna,
where the passband is large but the gain and/or radiation pattern need not stay the same
over the passband.

The term wideband audio (also termed HD Voice or wideband voice) denotes a
telephone conversation using a wideband codec, which uses a greater frequency range of
the audio spectrum than conventional telephone calls, resulting in clearer sound.

According to the United States Patent and Trademark Office, WIDEBAND is a registered
trademark[1] of WideBand Corporation, a US-based manufacturer of Gigabit Ethernet
managed switches, adapters, and networking equipment.[2]

In some contexts, wideband is distinguished from broadband in being broader.[3]

Quantization (signal processing)



Sampled signal (discrete signal): discrete time, continuous values.

Quantized signal: continuous time, discrete values.


Digital signal (sampled, quantized): discrete time, discrete values.

In digital signal processing, quantization is the process of approximating ("mapping") a
continuous range of values (or a very large set of possible discrete values) by a relatively
small, finite set of discrete symbols or integer values. For example, rounding a real
number in the interval [0,100] to an integer yields one of only 101 possible values.

In other words, quantization can be described as a mapping that represents a finite
continuous interval I = [a,b] of the range of a continuous-valued signal with a single
number c, which is also on that interval. For example, rounding to the nearest integer
(rounding ½ up) replaces the interval [c − 0.5, c + 0.5) with the number c, for integer c.
After quantization we have a finite set of values, which can then be encoded by binary
techniques, for example.
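A one-line uniform quantizer in C, assuming a step size delta and rounding to the
nearest level:

    /* Uniform quantizer sketch: map x to the nearest multiple of the step
     * delta, i.e. Q(x) = delta * round(x / delta). */
    #include <stdio.h>
    #include <math.h>

    static double quantize(double x, double delta)
    {
        return delta * round(x / delta);
    }

    int main(void)
    {
        const double delta = 0.25;
        for (double x = 0.0; x <= 1.0; x += 0.1)
            printf("x=%.2f -> Q(x)=%.2f\n", x, quantize(x, delta));
        return 0;
    }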

In signal processing, quantization refers to approximating the output by one of a discrete


and finite set of values, while replacing the input by a discrete set is called discretization,
and is done by sampling: the resulting sampled signal is called a discrete signal (discrete
time), and need not be quantized (it can have continuous values). To produce a digital
signal (discrete time and discrete values), one both samples (discrete time) and quantizes
the resulting sample values (discrete values).[1][2]

Contents

• 1 Quantization Noise
• 2 Applications
• 3 See also
• 4 External links
• 5 Notes

• 6 References

Quantization Noise


When a continuous signal is discretized, the difference between the continuous signal and
the discretized signal is an error. Strictly speaking this error is distortion, since the same
signal discretized repeatedly results in the same error. If a periodic signal like a sine wave
is synchronously sampled and discretized, then the discretized signal will exhibit
harmonic distortion. However, even though it is actually distortion, it can be analyzed as
noise. If the discretization is uniform and the width of the discretization interval is Δ,

then the noise power, n, is Δ²/12.[3]

Applications

In electronics, adaptive quantization is a quantization process that varies the step size
based on the changes of the input signal, as a means of efficient compression. Two
approaches commonly used are forward adaptive quantization and backward adaptive
quantization.
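A minimal backward-adaptive sketch follows. The step multipliers here are illustrative
assumptions, not taken from any particular standard; the point is that the decoder can
track the same step size because it sees the same codes, so no side information is needed.

    /* Backward-adaptive quantizer sketch: the step size grows when the
     * quantizer keeps hitting its outer levels and shrinks otherwise.
     * Both encoder and decoder update from the transmitted code alone,
     * so they stay in lockstep. Multiplier values are illustrative. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <math.h>

    int main(void)
    {
        double step = 0.1;             /* initial step size */
        double x[8] = {0.05, 0.07, 0.9, 1.4, 1.1, 0.2, 0.1, 0.05};

        for (int n = 0; n < 8; n++) {
            int code = (int)round(x[n] / step);  /* quantize          */
            if (code >  3) code =  3;            /* 3-bit range -3..3 */
            if (code < -3) code = -3;
            printf("x=%.2f  code=%+d  step=%.3f  xq=%.3f\n",
                   x[n], code, step, code * step);
            /* adapt: outer codes -> expand, inner codes -> contract */
            step *= (abs(code) >= 3) ? 1.5 : 0.95;
            if (step < 0.01) step = 0.01;        /* keep step bounded */
        }
        return 0;
    }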

In digital signal processing, the quantization process is the necessary and natural follower
of the sampling operation. It is necessary because in practice digital computers with
general-purpose CPUs are used to implement DSP algorithms, and since computers can
only process finite word length (finite resolution/precision) quantities, any infinite-precision
continuous-valued signal must be quantized to fit a finite resolution, so that it
can be represented (stored) in CPU registers and memory.

Note that it is not the continuous values of the analog function that prevent its binary
encoding; rather, it is the existence of infinitely many such values, which follows from the
definition of continuity and would therefore require infinitely many bits to represent. For
example, we can design a quantizer that represents a signal with a single bit (just two
levels), where one level is π = 3.14159... (say, encoded with a 1) and the other level is
e = 2.71828... (say, encoded with a 0). The quantized values of the signal are then
infinite-precision irrational numbers, but there are only two levels, so the output of the
quantizer can be represented with a binary symbol. It is thus not the discreteness of the
quantized values that enables them to be encoded, but their finiteness, which enables
encoding with a finite number of bits.

In theory there is no relation between the quantization values and the binary code words
used to encode them (other than a table that shows the corresponding mapping, as
exemplified above). In practice, however, we tend to use code words whose binary
mathematical values are related to the quantization levels they encode. Combining this
with the previous observations: if we wish to process the output of a quantizer within a
DSP/CPU system (which is almost always the case), then we cannot allow the
representation levels of the quantizer to take on arbitrary values, but only a restricted
range that can fit in computer registers.
A quantizer is identified with its number of levels M, the decision boundaries {di} and
the corresponding representation values {ri}.

The output of a quantizer has two important properties: 1) a distortion resulting from the
approximation and 2) a bit rate resulting from binary encoding of its levels. Therefore
the quantizer design problem is a rate-distortion optimization problem.

If we are only allowed to use a fixed-length code for the output level encoding (the
practical case), then the problem reduces to distortion minimization.

The design of a quantizer usually means the process of finding the sets {di} and {ri} such
that a measure of optimality is satisfied (such as MMSEQ, Minimum Mean Squared
Quantization Error).

Given the number of levels M, the optimal quantizer which minimizes the MSQE with
regard to the given signal statistics is called the Max-Lloyd quantizer, which is in general
a non-uniform type.

The most common quantizer type is the uniform one. It is simple to design and
implement, and in most cases it suffices to get satisfactory results. Indeed, by the very
nature of the design process, a given quantizer will only produce optimal results for the
assumed signal statistics. Since it is very difficult to correctly predict these in advance,
any static design will never produce actual optimal performance whenever the input
statistics deviate from those of the design assumption. The only solution is to use an
adaptive quantizer.

Quantization error

In source coding (analog-to-digital conversion and compression), the difference between


the actual analog value and quantized digital value is called quantization error or
quantization distortion. This error is either due to rounding or truncation. The error
signal is sometimes considered as an additional random signal called quantization noise
because of its stochastic behaviour.

Contents

• 1 Quantization error models


• 2 Quantization noise model
• 3 Other fields
• 4 See also
• 5 References

• 6 External links

Quantization error models


In the general case, the original signal is much larger than one least significant bit (LSB).
When this happens, the quantization error is not correlated with the signal and has a
uniform distribution. Its RMS value is the standard deviation of this distribution, given by
q/√12 ≈ 0.289 q, where q is the LSB size.
In the eight-bit ADC example, this represents 0.113% of the full signal range.

At lower levels the quantization error becomes dependent on the input signal, resulting in
distortion. This distortion is created after the anti-aliasing filter, and if the distortion
products lie above half the sample rate they will alias back into the audio band. In order
to make the quantization error independent of the input signal, noise with an amplitude of
2 least significant bits is added to the signal. This slightly reduces the signal-to-noise
ratio, but, ideally, completely eliminates the distortion. It is known as dither.

Quantization noise model

Quantization noise for a 2-bit ADC. The difference between the blue and red signals in
the upper graph is the quantization error, which is "added" to the quantised
signal and is the source of noise.

Quantization noise is a model of quantization error introduced by quantization in the


analog-to-digital conversion (ADC) in telecommunication systems and signal processing.
It is a rounding error between the analog input voltage to the ADC and the output
digitized value. The noise is non-linear and signal-dependent. It can be modelled in
several different ways.

In an ideal analog-to-digital converter, where the quantization error is uniformly
distributed between −1/2 LSB and +1/2 LSB, and the signal has a uniform distribution
covering all quantization levels, the signal-to-quantization-noise ratio (SQNR) can be
calculated from

    SQNR = 20 log₁₀(2^Q) ≈ 6.02 · Q dB,

where Q is the number of quantization bits.

The most common test signals that fulfil this are full amplitude triangle waves and
sawtooth waves.

For example, a 16-bit ADC has a maximum signal-to-noise ratio of 6.02 × 16 = 96.3 dB.

When the input signal is a full-amplitude sine wave, the distribution of the signal is no
longer uniform, and the corresponding equation is instead

    SQNR ≈ 1.761 + 6.02 · Q dB.

Here, the quantization noise is once again assumed to be uniformly distributed. When the
input signal has a high amplitude and a wide frequency spectrum this is the case.[1] In this
case a 16-bit ADC has a maximum signal-to-noise ratio of 98.09 dB. The 1.761 dB
difference in signal-to-noise only occurs because the signal is a full-scale sine wave
instead of a triangle or sawtooth.
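The 98.09 dB figure is easy to confirm empirically. A C sketch that quantizes a full-scale
sine to Q bits and measures the resulting SQNR (the sample count and test frequency are
arbitrary choices):

    /* Measure the SQNR of a full-scale sine quantized to Q bits and
     * compare it with the 1.761 + 6.02*Q dB rule of thumb. */
    #include <stdio.h>
    #include <math.h>

    int main(void)
    {
        const int    Q      = 16;
        const long   N      = 100000;                 /* sample count   */
        const double levels = ldexp(1.0, Q - 1) - 1.0; /* 2^(Q-1) - 1   */
        const double PI     = 3.14159265358979323846;
        double sig = 0.0, noise = 0.0;

        for (long n = 0; n < N; n++) {
            double x  = sin(2.0 * PI * 0.123456 * n);  /* full-scale sine */
            double xq = round(x * levels) / levels;    /* Q-bit quantizer */
            sig   += x * x;
            noise += (xq - x) * (xq - x);
        }
        printf("measured SQNR: %.2f dB\n", 10.0 * log10(sig / noise));
        printf("formula:       %.2f dB\n", 1.761 + 6.02 * Q);
        return 0;
    }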

(Typical real-life values are worse than this theoretical minimum, due to the addition of
dither to reduce the objectionable effects of quantization, and to imperfections of the
ADC circuitry. On the other hand, specifications often use A-weighted measurements to
hide the inaudible effects of noise shaping, which improves the measurement.)

For complex signals in high-resolution ADCs this is an accurate model. For low-
resolution ADCs, low-level signals in high-resolution ADCs, and for simple waveforms
the quantization noise is not uniformly distributed, making this model inaccurate.[2] In
these cases the quantization noise distribution is strongly affected by the exact amplitude
of the signal.

The calculations above, however, assume a completely filled input channel. If this is not
the case - if the input signal is small - the relative quantization distortion can be very
large. To circumvent this issue, analog compressors and expanders can be used, but these
introduce large amounts of distortion as well, especially if the compressor does not match
the expander.

Quantization Error = Quantized Output - Analog Input

Other fields


Many physical quantities are actually quantized by physical entities. Examples of fields
where this limitation applies include electronics (due to electrons), optics (due to
photons), biology (due to DNA), and chemistry (due to molecules). This is sometimes
known as the "quantum noise limit" of systems in those fields. It is a different
manifestation of quantization error, in which theoretical models may be analog but the
physics occurs digitally. Around the quantum limit, the distinction between analog and
digital quantities vanishes.

Coherence bandwidth

Coherence bandwidth is a statistical measurement of the range of frequencies over
which the channel can be considered "flat", or in other words the approximate maximum
bandwidth or frequency interval over which two frequencies of a signal are likely to
experience comparable or correlated amplitude fading. If the multipath time delay spread
equals D seconds, then the coherence bandwidth Wc in rad/s is given approximately by
the equation:

    Wc ≈ 2π / D.

For example, a delay spread of D = 1 μs gives Wc ≈ 6.3 × 10⁶ rad/s, or about 1 MHz.

The coherence bandwidth varies over cellular or PCS communications paths because the
multipath spread D varies from path to path.

Application
Frequencies within a coherence bandwidth of one another tend to all fade in a similar or
correlated fashion. One reason for designing the CDMA IS-95 waveform with a
bandwidth of approximately 1.25 MHz is that in many urban signaling environments the
coherence bandwidth Wc is significantly less than 1.25 MHz. Therefore, when fading
occurs, it occurs only over a relatively small fraction of the total CDMA signal
bandwidth. The portion of the signal bandwidth over which fading does not occur
typically contains enough signal power to sustain reliable communications. This is the
bandwidth over which the channel transfer function remains virtually constant.

Nyquist–Shannon sampling theorem


Fig.1: Hypothetical spectrum of a bandlimited signal as a function of frequency

The Nyquist–Shannon sampling theorem, after Harry Nyquist and Claude Shannon, is
a fundamental result in the field of information theory, in particular telecommunications
and signal processing. Sampling is the process of converting a signal (for example, a
function of continuous time or space) into a numeric sequence (a function of discrete time
or space). Shannon's version of the theorem states:[1]

If a function x(t) contains no frequencies higher than B hertz, it is completely determined


by giving its ordinates at a series of points spaced 1/(2B) seconds apart.

The theorem is commonly called the Nyquist sampling theorem; since it was also
discovered independently by E. T. Whittaker, by Vladimir Kotelnikov, and by others, it is
also known as Nyquist–Shannon–Kotelnikov, Whittaker–Shannon–Kotelnikov,
Whittaker–Nyquist–Kotelnikov–Shannon, WKS, etc., sampling theorem, as well as
the Cardinal Theorem of Interpolation Theory. It is often referred to simply as the
sampling theorem.

In essence, the theorem shows that a bandlimited analog signal that has been sampled can
be perfectly reconstructed from an infinite sequence of samples if the sampling rate
exceeds 2B samples per second, where B is the highest frequency in the original signal. If
a signal contains a component at exactly B hertz, then samples spaced at exactly 1/(2B)
seconds do not completely determine the signal, Shannon's statement notwithstanding.
This sufficient condition can be weakened, as discussed at Sampling of non-baseband
signals below.

More recent statements of the theorem are sometimes careful to exclude the equality
condition; that is, the condition is if x(t) contains no frequencies higher than or equal to
B; this condition is equivalent to Shannon's except when the function includes a steady
sinusoidal component at exactly frequency B.

The theorem assumes an idealization of any real-world situation, as it only applies to


signals that are sampled for infinite time; any time-limited x(t) cannot be perfectly
bandlimited. Perfect reconstruction is mathematically possible for the idealized model but
only an approximation for real-world signals and sampling techniques, albeit in practice
often a very good one.

The theorem also leads to a formula for reconstruction of the original signal. The
constructive proof of the theorem leads to an understanding of the aliasing that can occur
when a sampling system does not satisfy the conditions of the theorem.
The sampling theorem provides a sufficient condition, but not a necessary one, for perfect
reconstruction. The field of compressed sensing provides a stricter sampling condition
when the underlying signal is known to be sparse. Compressed sensing specifically yields
a sub-Nyquist sampling criterion.

Contents

• 1 Introduction
• 2 The sampling process
• 3 Reconstruction
• 4 Practical considerations
• 5 Aliasing
• 6 Application to multivariable signals and images
• 7 Downsampling
• 8 Critical frequency
• 9 Mathematical basis for the theorem
• 10 Shannon's original proof
• 11 Sampling of non-baseband signals
• 12 Nonuniform sampling
• 13 Beyond Nyquist
• 14 Historical background
o 14.1 Other discoverers
o 14.2 Why Nyquist?
• 15 See also
• 16 Notes
• 17 References

• 18 External links

Introduction
A signal or function is bandlimited if it contains no energy at frequencies higher than
some bandlimit or bandwidth B. A signal that is bandlimited is constrained in how
rapidly it changes in time, and therefore how much detail it can convey in an interval of
time. The sampling theorem asserts that the uniformly spaced discrete samples are a
complete representation of the signal if this bandwidth is less than half the sampling rate.
To formalize these concepts, let x(t) represent a continuous-time signal and X(f) be the
continuous Fourier transform of that signal:

    X(f) = ∫ x(t) e^(−i 2π f t) dt   (integral over all t).

The signal x(t) is said to be bandlimited to a one-sided baseband bandwidth, B, if

    X(f) = 0   for all |f| > B,

or, equivalently, supp(X) ⊆ [−B, B].[2] Then the sufficient condition for exact
reconstructability from samples at a uniform sampling rate fs (in samples per unit time) is:

    fs > 2B.

The quantity 2B is called the Nyquist rate and is a property of the bandlimited signal,
while fs/2 is called the Nyquist frequency and is a property of this sampling system.

The time interval between successive samples is referred to as the sampling interval:

    T = 1 / fs,

and the samples of x(t) are denoted by:

    x[n] = x(nT),

where n is an integer. The sampling theorem leads to a procedure for reconstructing the
original x(t) from the samples and states sufficient conditions for such a reconstruction to
be exact.

The sampling process


The theorem describes two processes in signal processing: a sampling process, in which a
continuous time signal is converted to a discrete time signal, and a reconstruction
process, in which the original continuous signal is recovered from the discrete time
signal.

The continuous signal varies over time (or space in a digitized image, or another
independent variable in some other application) and the sampling process is performed by
measuring the continuous signal's value every T units of time (or space), which is called
the sampling interval. In practice, for signals that are a function of time, the sampling
interval is typically quite small, on the order of milliseconds, microseconds, or less. This
results in a sequence of numbers, called samples, to represent the original signal. Each
sample value is associated with the instant in time when it was measured. The reciprocal
of the sampling interval (1/T) is the sampling frequency denoted fs, which is measured in
samples per unit of time. If T is expressed in seconds, then fs is expressed in Hz.

Reconstruction
Reconstruction of the original signal is an interpolation process that mathematically
defines a continuous-time signal x(t) from the discrete samples x[n], at times in
between the sample instants nT.

Fig.2: The normalized sinc function: sin(πx) / (πx) ... showing the central peak at x= 0,
and zero-crossings at the other integer values of x.

• The procedure: Each sample value is multiplied by the sinc function scaled so
that the zero-crossings of the sinc function occur at the sampling instants and that
the sinc function's central point is shifted to the time of that sample, nT. All of
these shifted and scaled functions are then added together to recover the original
signal. The scaled and time-shifted sinc functions are continuous making the sum
of these also continuous, so the result of this operation is a continuous signal. This
procedure is represented by the Whittaker–Shannon interpolation formula.

• The condition: The signal obtained from this reconstruction process can have no
frequencies higher than one-half the sampling frequency. According to the
theorem, the reconstructed signal will match the original signal provided that the
original signal contains no frequencies at or above this limit. This condition is
called the Nyquist criterion, or sometimes the Raabe condition.

If the original signal contains a frequency component equal to one-half the sampling rate,
the condition is not satisfied. The resulting reconstructed signal may have a component at
that frequency, but the amplitude and phase of that component generally will not match
the original component.

This reconstruction or interpolation using sinc functions is not the only interpolation
scheme. Indeed, it is impossible in practice because it requires summing an infinite
number of terms. However, it is the interpolation method that in theory exactly
reconstructs any given bandlimited x(t) with any bandlimit B < 1/(2T); any other method
that does so is formally equivalent to it.
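A minimal C sketch of Whittaker–Shannon interpolation, x(t) ≈ Σₙ x[n]·sinc((t − nT)/T),
truncated to a finite number of samples; the truncation is exactly the interpolation error
discussed in the next section. The signal and evaluation point are illustrative choices.

    /* Whittaker-Shannon interpolation, truncated to NSAMP samples:
     *   x(t) ~= sum_n x[n] * sinc((t - nT)/T),  sinc(u) = sin(pi u)/(pi u).
     * Truncation makes this an approximation of the ideal reconstruction. */
    #include <stdio.h>
    #include <math.h>

    #define NSAMP 32
    static const double PI = 3.14159265358979323846;

    static double sinc(double u)
    {
        return (fabs(u) < 1e-12) ? 1.0 : sin(PI * u) / (PI * u);
    }

    int main(void)
    {
        const double T  = 1.0 / 8.0;  /* fs = 8 samples per unit time   */
        const double f0 = 1.0;        /* 1 Hz tone, well below fs/2 = 4 */
        double x[NSAMP];

        for (int n = 0; n < NSAMP; n++)   /* "sample" a bandlimited signal */
            x[n] = sin(2.0 * PI * f0 * n * T);

        /* reconstruct at an off-grid time and compare with the true value */
        double t = 1.7, xr = 0.0;
        for (int n = 0; n < NSAMP; n++)
            xr += x[n] * sinc((t - n * T) / T);
        printf("reconstructed x(%.3f) = %.6f\n", t, xr);
        printf("true          x(%.3f) = %.6f\n", t, sin(2.0 * PI * f0 * t));
        return 0;
    }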

Practical considerations


A few consequences can be drawn from the theorem:

• If the highest frequency B in the original signal is known, the theorem gives the
lower bound on the sampling frequency for which perfect reconstruction can be
assured. This lower bound to the sampling frequency, 2B, is called the Nyquist
rate.

• If instead the sampling frequency is known, the theorem gives us an upper bound
for frequency components, B<fs/2, of the signal to allow for perfect
reconstruction. This upper bound is the Nyquist frequency, denoted fN.

• Both of these cases imply that the signal to be sampled must be bandlimited; that
is, any component of this signal which has a frequency above a certain bound
should be zero, or at least sufficiently close to zero to allow us to neglect its
influence on the resulting reconstruction. In the first case, the condition of
bandlimitation of the sampled signal can be accomplished by assuming a model of
the signal which can be analysed in terms of the frequency components it
contains; for example, sounds that are made by a speaking human normally
contain very small frequency components at or above 10 kHz and it is then
sufficient to sample such an audio signal with a sampling frequency of at least
20 kHz. For the second case, we have to assure that the sampled signal is
bandlimited such that frequency components at or above half of the sampling
frequency can be neglected. This is usually accomplished by means of a suitable
low-pass filter; for example, if it is desired to sample speech waveforms at 8 kHz,
the signals should first be lowpass filtered to below 4 kHz.

• In practice, neither of the two statements of the sampling theorem described


above can be completely satisfied, and neither can the reconstruction formula be
precisely implemented. The reconstruction process that involves scaled and
delayed sinc functions can be described as ideal. It cannot be realized in practice
since it implies that each sample contributes to the reconstructed signal at almost
all time points, requiring summing an infinite number of terms. Instead, some
type of approximation of the sinc functions, finite in length, has to be used. The
error that corresponds to the sinc-function approximation is referred to as
interpolation error. Practical digital-to-analog converters produce neither scaled
and delayed sinc functions nor ideal impulses (that if ideally low-pass filtered
would yield the original signal), but a sequence of scaled and delayed rectangular
pulses. This practical piecewise-constant output can be modeled as a zero-order
hold filter driven by the sequence of scaled and delayed dirac impulses referred to
in the mathematical basis section below. A shaping filter is sometimes used after
the DAC with zero-order hold to make a better overall approximation.

• Furthermore, in practice, a signal can never be perfectly bandlimited, since ideal


"brick-wall" filters cannot be realized. All practical filters can only attenuate
frequencies outside a certain range, not remove them entirely. In addition to this, a
"time-limited" signal can never be bandlimited. This means that even if an ideal
reconstruction could be made, the reconstructed signal would not be exactly the
original signal. The error that corresponds to the failure of bandlimitation is
referred to as aliasing.

• The sampling theorem does not say what happens when the conditions and
procedures are not exactly met, but its proof suggests an analytical framework in
which the non-ideality can be studied. A designer of a system that deals with
sampling and reconstruction processes needs a thorough understanding of the
signal to be sampled, in particular its frequency content, the sampling frequency,
how the signal is reconstructed in terms of interpolation, and the requirement for
the total reconstruction error, including aliasing, sampling, interpolation and other
errors. These properties and parameters may need to be carefully tuned in order to
obtain a useful system.

Aliasing
Main article: Aliasing

The Poisson summation formula shows that the samples, x[n] = x(nT), of function x(t) are
sufficient to create a periodic summation of function X(f). The result is:

    Xs(f) = Σₖ X(f − k·fs),  with the sum over all integers k.     (Eq.1)

As depicted in Figures 3, 4, and 8, copies of X(f) are shifted by multiples of fs and
combined by addition.

Fig.3: Hypothetical spectrum of a properly sampled bandlimited signal (blue) and images
(green) that do not overlap. A "brick-wall" low-pass filter can remove the images and
leave the original spectrum, thus recovering the original signal from the samples.

If the sampling condition is not satisfied, adjacent copies overlap, and it is not possible in
general to discern an unambiguous X(f). Any frequency component above fs/2 is
indistinguishable from a lower-frequency component, called an alias, associated with one
of the copies. The reconstruction technique described below produces the alias, rather
than the original component, in such cases.
Fig.4 Top: Hypothetical spectrum of an insufficiently sampled bandlimited signal (blue),
X(f), where the images (green) overlap. These overlapping edges or "tails" of the images
add, creating a spectrum unlike the original. Bottom: Hypothetical spectrum of a
marginally sufficiently sampled bandlimited signal (blue), XA(f), where the images
(green) narrowly do not overlap. But the overall sampled spectrum of XA(f) is identical to
the overall inadequately sampled spectrum of X(f) (top) because the sum of baseband and
images are the same in both cases. The discrete sampled signals xA[n] and x[n] are also
identical. It is not possible, just from examining the spectra (or the sampled signals), to
tell the two situations apart. If this were an audio signal, xA[n] and x[n] would sound the
same and the presumed "properly" sampled xA[n] would be the alias of x[n] since the
spectrum XA(f) masquerades as the spectrum X(f).

For a sinusoidal component of exactly half the sampling frequency, the component will in
general alias to another sinusoid of the same frequency, but with a different phase and
amplitude.

To prevent or reduce aliasing, two things can be done:

1. Increase the sampling rate, to above twice some or all of the frequencies that are
aliasing.
2. Introduce an anti-aliasing filter or make the anti-aliasing filter more stringent.

The purpose of the anti-aliasing filter is to restrict the bandwidth of the signal to satisfy
the condition for proper sampling. Such a restriction works in theory, but is not precisely satisfiable in
reality, because realizable filters will always allow some leakage of high frequencies.
However, the leakage energy can be made small enough so that the aliasing effects are
negligible.
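
As a concrete illustration, consider the following minimal numpy sketch (the 1 kHz rate
and 900 Hz tone are assumed example values, not taken from the text): the samples of a
tone above fs/2 coincide exactly with those of its lower-frequency alias at fs − f.

    # Example values (assumed for illustration): fs = 1 kHz, tone at 900 Hz.
    import numpy as np

    fs = 1000.0                  # sampling rate, Hz
    n = np.arange(32)            # sample indices
    f = 900.0                    # tone above the Nyquist frequency fs/2 = 500 Hz
    f_alias = fs - f             # its 100 Hz alias

    x = np.cos(2 * np.pi * f * n / fs)
    x_alias = np.cos(2 * np.pi * f_alias * n / fs)

    assert np.allclose(x, x_alias)   # the two sampled sequences are identical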

[edit] Application to multivariable signals and images


Fig.5: Subsampled image showing a Moiré pattern

The sampling theorem is usually formulated for functions of a single variable.
Consequently, the theorem is directly applicable to time-dependent signals and is
normally formulated in that context. However, the sampling theorem can be extended in a
straightforward way to functions of arbitrarily many variables. Grayscale images, for
example, are often represented as two-dimensional arrays (or matrices) of real numbers
representing the relative intensities of pixels (picture elements) located at the
intersections of row and column sample locations. As a result, images require two
independent variables, or indices, to specify each pixel uniquely — one for the row, and
one for the column.

Color images typically consist of a composite of three separate grayscale images, one to
represent each of the three primary colors — red, green, and blue, or RGB for short.
Other colorspaces using 3-vectors for colors include HSV, LAB, XYZ, etc. Some
colorspaces such as cyan, magenta, yellow, and black (CMYK) may represent color by
four dimensions. All of these are treated as vector-valued functions over a two-
dimensional sampled domain.

Similar to one-dimensional discrete-time signals, images can also suffer from aliasing if
the sampling resolution, or pixel density, is inadequate. For example, a digital photograph
of a striped shirt with high frequencies (in other words, the distance between the stripes is
small), can cause aliasing of the shirt when it is sampled by the camera's image sensor.
The aliasing appears as a moiré pattern. The "solution" is to sample at a higher rate in the
spatial domain, in this case by moving closer to the shirt, using a higher resolution sensor,
or optically blurring the image before acquiring it with the sensor.

Another example is shown to the left in the brick patterns. The top image shows the
effects when the sampling theorem's condition is not satisfied. When software rescales an
image (the same process that creates the thumbnail shown in the lower image) it, in
effect, runs the image through a low-pass filter first and then downsamples the image to
result in a smaller image that does not exhibit the moiré pattern. The top image is what
happens when the image is downsampled without low-pass filtering: aliasing results.

The application of the sampling theorem to images should be made with care. For
example, the sampling process in any standard image sensor (CCD or CMOS camera) is
relatively far from the ideal sampling which would measure the image intensity at a
single point. Instead these devices have a relatively large sensor area at each sample point
in order to obtain sufficient amount of light. In other words, any detector has a finite-
width point spread function. The analog optical image intensity function which is
sampled by the sensor device is not in general bandlimited, and the non-ideal sampling is
itself a useful type of low-pass filter, though not always sufficient to remove enough high
frequencies to sufficiently reduce aliasing. When the area of the sampling spot (the size
of the pixel sensor) is not large enough to provide sufficient anti-aliasing, a separate anti-
aliasing filter (optical low-pass filter) is typically included in a camera system to further
blur the optical image. Despite images having these problems in relation to the sampling
theorem, the theorem can be used to describe the basics of down and up sampling of
images.

[edit] Downsampling
When a signal is downsampled, the sampling theorem can be invoked via the artifice of
resampling a hypothetical continuous-time reconstruction. The Nyquist criterion must
still be satisfied with respect to the new lower sampling frequency in order to avoid
aliasing. To meet the requirements of the theorem, the signal must usually pass through a
low-pass filter of appropriate cutoff frequency as part of the downsampling operation.
This low-pass filter, which prevents aliasing, is called an anti-aliasing filter.
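
A minimal sketch of this operation in Python/scipy (the rates, filter length, and cutoff
are illustrative assumptions, not prescribed values):

    import numpy as np
    from scipy import signal

    fs = 48000                         # original sampling rate, Hz (assumed)
    M = 4                              # decimation factor; new rate is 12 kHz
    t = np.arange(fs) / fs
    x = np.cos(2 * np.pi * 1000 * t) + 0.5 * np.cos(2 * np.pi * 9000 * t)

    # Anti-aliasing filter: FIR low-pass with cutoff below the new Nyquist (6 kHz).
    h = signal.firwin(numtaps=101, cutoff=5500, fs=fs)
    x_filtered = signal.lfilter(h, 1.0, x)

    y = x_filtered[::M]                # keep every M-th sample

Without the filter, the 9 kHz component would alias to 3 kHz in the 12 kHz output;
with it, only the 1 kHz component survives the decimation.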

[edit] Critical frequency


Fig.7: A family of sinusoids at the critical frequency, all having the same sample
sequences of alternating +1 and –1. That is, they all are aliases of each other, even though
their frequency is not above half the sample rate.

To illustrate the necessity of fs > 2B, consider the sinusoid:

    x(t) = \cos(2\pi B t + \theta) = \cos(2\pi B t)\cos(\theta) - \sin(2\pi B t)\sin(\theta)

With fs = 2B or equivalently T = 1/(2B), the samples are given by:

    x(nT) = \cos(\pi n + \theta) = \cos(\pi n)\cos(\theta) - \sin(\pi n)\sin(\theta) = (-1)^n \cos(\theta)

Those samples cannot be distinguished from the samples of:

    x_A(t) = \cos(2\pi B t)\cos(\theta)

But for any θ such that sin(θ) ≠ 0, x(t) and xA(t) have different amplitudes and different
phase. This and other ambiguities are the reason for the strict inequality of the sampling
theorem's condition.
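
The loss of amplitude and phase information at exactly fs = 2B can be checked
numerically; in the small sketch below, B and θ are arbitrary example values.

    import numpy as np

    B = 100.0                     # band edge, Hz (example value)
    fs = 2 * B                    # sampling exactly at the critical rate
    T = 1 / fs
    theta = 0.7                   # arbitrary phase
    n = np.arange(16)

    x = np.cos(2 * np.pi * B * n * T + theta)             # samples of x(t)
    x_A = np.cos(2 * np.pi * B * n * T) * np.cos(theta)   # samples of x_A(t)

    # Both reduce to (-1)^n * cos(theta); theta cannot be recovered.
    assert np.allclose(x, x_A)
    assert np.allclose(x, (-1.0) ** n * np.cos(theta))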

[edit] Mathematical basis for the theorem

Fig.8: Spectrum, Xs(f), of a properly sampled bandlimited signal (blue) and images
(green) that do not overlap. A brick-wall low-pass filter, H(f), removes the images, leaves
the original spectrum, X(f), and recovers the original signal from the samples.

From Figures 3 and 8, it is apparent that when there is no overlap of the copies (aka
"images") of X(f), the k = 0 term of Xs(f) can be recovered by the product:

    X(f) = H(f) \cdot X_s(f),

where:

    H(f) \triangleq \begin{cases} 1 & |f| < B \\ 0 & |f| > f_s - B \end{cases}

H(f) need not be precisely defined in the region [B, fs − B] because Xs(f) is zero in that
region. However, the worst case is when B = fs/2, the Nyquist frequency. A function that
is sufficient for that and all less severe cases is:

    H(f) = \mathrm{rect}\!\left(\frac{f}{f_s}\right) = \begin{cases} 1 & |f| < f_s/2 \\ 0 & |f| > f_s/2 \end{cases}

where rect(u) is the rectangular function.

Therefore:

    X(f) = \mathrm{rect}(T f) \cdot X_s(f) = \mathrm{rect}(T f) \cdot \sum_{n=-\infty}^{\infty} T\, x(nT)\, e^{-i 2\pi n T f}

(from Eq.1, above).

The original function that was sampled can be recovered by an inverse Fourier transform:

    x(t) = \sum_{n=-\infty}^{\infty} x(nT)\, \mathrm{sinc}\!\left(\frac{t - nT}{T}\right)   [3]

which is the Whittaker–Shannon interpolation formula. It shows explicitly how the
samples, x(nT), can be combined to reconstruct x(t).
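
A truncated version of the formula is easy to evaluate numerically. The following
sketch (the signal, rate, and time grid are assumed example values, and the infinite
sum is necessarily truncated to a finite window) reconstructs a tone from its samples:

    import numpy as np

    fs = 8.0                                   # sampling rate, Hz (example value)
    T = 1 / fs
    n = np.arange(-64, 64)                     # finite window of samples (truncated sum)
    x_n = np.sin(2 * np.pi * 1.3 * n * T)      # 1.3 Hz tone, well below fs/2 = 4 Hz

    t = np.linspace(-2, 2, 1001)               # dense grid for reconstruction
    # x(t) = sum_n x(nT) * sinc((t - nT)/T); np.sinc is the normalized sinc.
    x_t = (x_n[:, None] * np.sinc((t[None, :] - n[:, None] * T) / T)).sum(axis=0)

    # Agreement is close but not exact, since the infinite sum was truncated.
    assert np.allclose(x_t, np.sin(2 * np.pi * 1.3 * t), atol=0.05)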

• From Figure 8, it is clear that larger-than-necessary values of fs (smaller values of
T), called oversampling, have no effect on the outcome of the reconstruction and
have the benefit of leaving room for a transition band in which H(f) is free to take
intermediate values. Undersampling, which causes aliasing, is not in general a
reversible operation.
• Theoretically, the interpolation formula can be implemented as a low pass filter,
whose impulse response is sinc(t/T) and whose input is \sum_{n=-\infty}^{\infty} x(nT)\, \delta(t - nT),
which is a Dirac comb function modulated by the signal samples. Practical
digital-to-analog converters (DAC) implement an approximation like the zero-order
hold. In that case, oversampling can reduce the approximation error.

[edit] Shannon's original proof


The original proof presented by Shannon is elegant and quite brief, but it offers less
intuitive insight into the subtleties of aliasing, both unintentional and intentional. Quoting
Shannon's original paper, which uses f for the function, F for the spectrum, and W for the
bandwidth limit:

Let F(ω) be the spectrum of f(t). Then

    f(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} F(\omega)\, e^{i\omega t}\, d\omega = \frac{1}{2\pi} \int_{-2\pi W}^{2\pi W} F(\omega)\, e^{i\omega t}\, d\omega,

since F(ω) is assumed to be zero outside the band W. If we let

    t = \frac{n}{2W},

where n is any positive or negative integer, we obtain

    f\!\left(\frac{n}{2W}\right) = \frac{1}{2\pi} \int_{-2\pi W}^{2\pi W} F(\omega)\, e^{i\omega \frac{n}{2W}}\, d\omega.

On the left are values of f(t) at the sampling points. The integral on the right will
be recognized as essentially the nth coefficient in a Fourier-series expansion of
the function F(ω), taking the interval –W to W as a fundamental period. This
means that the values of the samples f(n / 2W) determine the Fourier coefficients
in the series expansion of F(ω). Thus they determine F(ω), since F(ω) is zero for
frequencies greater than W, and for lower frequencies F(ω) is determined if its
Fourier coefficients are determined. But F(ω) determines the original function f(t)
completely, since a function is determined if its spectrum is known. Therefore the
original samples determine the function f(t) completely.

Shannon's proof of the theorem is complete at that point, but he goes on to discuss
reconstruction via sinc functions, what we now call the Whittaker–Shannon interpolation
formula as discussed above. He does not derive or prove the properties of the sinc
function, but these would have been familiar to engineers reading his works at the time,
since the Fourier pair relationship between rect (the rectangular function) and sinc was
well known. Quoting Shannon:

Let xn be the nth sample. Then the function f(t) is represented by:

    f(t) = \sum_{n=-\infty}^{\infty} x_n \frac{\sin \pi(2Wt - n)}{\pi(2Wt - n)}

As in the other proof, the existence of the Fourier transform of the original signal is
assumed, so the proof does not say whether the sampling theorem extends to bandlimited
stationary random processes.

[edit] Sampling of non-baseband signals


As discussed by Shannon:[1]

A similar result is true if the band does not start at zero frequency but at
some higher value, and can be proved by a linear translation
(corresponding physically to single-sideband modulation) of the zero-
frequency case. In this case the elementary pulse is obtained from sin(x)/x
by single-side-band modulation.

That is, a sufficient no-loss condition for sampling signals that do not have baseband
components exists that involves the width of the non-zero frequency interval as opposed
to its highest frequency component. See Sampling (signal processing) for more details
and examples.

A bandpass condition is that X(f) = 0 for all nonnegative f outside the open band of
frequencies:

    \left( \frac{N}{2} f_s,\; \frac{N+1}{2} f_s \right)

for some nonnegative integer N. This formulation includes the normal baseband condition
as the case N = 0.

The corresponding interpolation function is the impulse response of an ideal brick-wall
bandpass filter (as opposed to the ideal brick-wall lowpass filter used above) with cutoffs
at the upper and lower edges of the specified band, which is the difference between a pair
of lowpass impulse responses:

    (N+1)\,\mathrm{sinc}\!\left(\frac{(N+1)t}{T}\right) - N\,\mathrm{sinc}\!\left(\frac{N t}{T}\right)

Other generalizations, for example to signals occupying multiple non-contiguous bands,
are possible as well. Even the most generalized form of the sampling theorem does not
have a provably true converse. That is, one cannot conclude that information is
necessarily lost just because the conditions of the sampling theorem are not satisfied;
from an engineering perspective, however, it is generally safe to assume that if the
sampling theorem is not satisfied then information will most likely be lost.

[edit] Nonuniform sampling


The sampling theory of Shannon can be generalized for the case of nonuniform samples,
that is, samples not taken equally spaced in time. Shannon sampling theory for non-
uniform sampling states that a band-limited signal can be perfectly reconstructed from its
samples if the average sampling rate satisfies the Nyquist condition.[4] Therefore,
although uniformly spaced samples may result in easier reconstruction algorithms,
uniform spacing is not a necessary condition for perfect reconstruction.

[edit] Beyond Nyquist


The Nyquist–Shannon sampling theorem provides a sufficient condition for the sampling
and reconstruction of a band-limited signal. When reconstruction is done via the
Whittaker–Shannon interpolation formula, the Nyquist criterion is also a necessary
condition to avoid aliasing, in the sense that if samples are taken at a slower rate than
twice the band limit, then there are some signals that will not be correctly reconstructed.
However, if further restrictions are imposed on the signal, then the Nyquist criterion may
no longer be a necessary condition.

A non-trivial example of exploiting extra assumptions about the signal is given by the
recent field of compressed sensing, which allows for full reconstruction with a sub-
Nyquist sampling rate. Specifically, this applies to signals that are sparse (or
compressible) in some domain. As an example, compressed sensing deals with signals
that may have a low over-all bandwidth (say, the effective bandwidth EB), but the
frequency components are spread out in the overall bandwidth B, rather than all together
in a single band, so that the passband technique doesn't apply. In other words, the
frequency spectrum is sparse. Traditionally, the necessary sampling rate is thus 2B.
Using compressed sensing techniques, the signal could be perfectly reconstructed if it is
sampled at a rate slightly greater than 2·EB. The downside of this approach is that
reconstruction is no longer given by a formula, but instead by the solution to a convex
optimization program which requires well-studied but nonlinear methods.

[edit] Historical background


The sampling theorem was implied by the work of Harry Nyquist in 1928 ("Certain
topics in telegraph transmission theory"), in which he showed that up to 2B independent
pulse samples could be sent through a system of bandwidth B; but he did not explicitly
consider the problem of sampling and reconstruction of continuous signals. About the
same time, Karl Küpfmüller showed a similar result,[5] and discussed the sinc-function
impulse response of a band-limiting filter, via its integral, the step response Integralsinus;
this bandlimiting and reconstruction filter that is so central to the sampling theorem is
sometimes referred to as a Küpfmüller filter (but seldom so in English).

The sampling theorem, essentially a dual of Nyquist's result, was proved by Claude E.
Shannon in 1949 ("Communication in the presence of noise"). V. A. Kotelnikov
published similar results in 1933 ("On the transmission capacity of the 'ether' and of
cables in electrical communications", translation from the Russian), as did the
mathematician E. T. Whittaker in 1915 ("Expansions of the Interpolation-Theory",
"Theorie der Kardinalfunktionen"), J. M. Whittaker in 1935 ("Interpolatory function
theory"), and Gabor in 1946 ("Theory of communication").

[edit] Other discoverers

Others who have independently discovered or played roles in the development of the
sampling theorem have been discussed in several historical articles, for example by
Jerri[6] and by Lüke.[7] For example, Lüke points out that H. Raabe, an assistant to
Küpfmüller, proved the theorem in his 1939 Ph.D. dissertation; the term Raabe condition
came to be associated with the criterion for unambiguous representation (sampling rate
greater than twice the bandwidth).

Meijering[8] mentions several other discoverers and names in a paragraph and pair of
footnotes:

As pointed out by Higgins [135], the sampling theorem should really be considered in
two parts, as done above: the first stating the fact that a bandlimited function is
completely determined by its samples, the second describing how to reconstruct the
function using its samples. Both parts of the sampling theorem were given in a somewhat
different form by J. M. Whittaker [350, 351, 353] and before him also by Ogura [241,
242]. They were probably not aware of the fact that the first part of the theorem had been
stated as early as 1897 by Borel [25].27 As we have seen, Borel also used around that time
what became known as the cardinal series. However, he appears not to have made the
link [135]. In later years it became known that the sampling theorem had been presented
before Shannon to the Russian communication community by Kotel'nikov [173]. In more
implicit, verbal form, it had also been described in the German literature by Raabe [257].
Several authors [33, 205] have mentioned that Someya [296] introduced the theorem in
the Japanese literature parallel to Shannon. In the English literature, Weston [347]
introduced it independently of Shannon around the same time.28
27 Several authors, following Black [16], have claimed that this first part of the sampling
theorem was stated even earlier by Cauchy, in a paper [41] published in 1841. However,
the paper of Cauchy does not contain such a statement, as has been pointed out by
Higgins [135].
28 As a consequence of the discovery of the several independent introductions of the
sampling theorem, people started to refer to the theorem by including the names of the
aforementioned authors, resulting in such catchphrases as “the Whittaker-Kotel’nikov-
Shannon (WKS) sampling theorem" [155] or even "the Whittaker-Kotel'nikov-Raabe-
Shannon-Someya sampling theorem" [33]. To avoid confusion, perhaps the best thing to
do is to refer to it as the sampling theorem, "rather than trying to find a title that does
justice to all claimants" [136].

[edit] Why Nyquist?


Exactly how, when, or why Harry Nyquist had his name attached to the sampling
theorem remains obscure. The term Nyquist Sampling Theorem (capitalized thus)
appeared as early as 1959 in a book from his former employer, Bell Labs,[9] and appeared
again in 1963,[10] and not capitalized in 1965.[11] It had been called the Shannon Sampling
Theorem as early as 1954,[12] but also just the sampling theorem by several other books in
the early 1950s.

In 1958, Blackman and Tukey[13] cited Nyquist's 1928 paper as a reference for the
sampling theorem of information theory, even though that paper does not treat sampling
and reconstruction of continuous signals as others did. Their glossary of terms includes
these entries:

Sampling theorem (of information theory)
Nyquist's result that equi-spaced data, with two or more points per cycle of
highest frequency, allows reconstruction of band-limited functions. (See Cardinal
theorem.)
Cardinal theorem (of interpolation theory)
A precise statement of the conditions under which values given at a doubly
infinite set of equally spaced points can be interpolated to yield a continuous
band-limited function with the aid of the function sin(x)/x

Exactly what "Nyquist's result" they are referring to remains mysterious.

When Shannon stated and proved the sampling theorem in his 1949 paper, according to
Meijering[8] "he referred to the critical sampling interval T = 1/(2W) as the Nyquist
interval corresponding to the band W, in recognition of Nyquist’s discovery of the
fundamental importance of this interval in connection with telegraphy." This explains
Nyquist's name on the critical interval, but not on the theorem.

Similarly, Nyquist's name was attached to Nyquist rate in 1953 by Harold S. Black:[14]

"If the essential frequency range is limited to B cycles per second, 2B was given
by Nyquist as the maximum number of code elements per second that could be
unambiguously resolved, assuming the peak interference is less than half a quantum
step. This rate is generally referred to as signaling at the Nyquist rate and 1/(2B)
has been termed a Nyquist interval." (bold added for emphasis; italics as in the
original)

According to the OED, this may be the origin of the term Nyquist rate. In Black's usage,
it is not a sampling rate, but a signaling rate.

Sampling (signal processing)

Signal sampling representation. The continuous signal is represented with a green color
whereas the discrete samples are in blue.

In signal processing, sampling is the reduction of a continuous signal to a discrete signal.
A common example is the conversion of a sound wave (a continuous-time signal) to a
sequence of samples (a discrete-time signal).

A sample refers to a value or set of values at a point in time and/or space.

A sampler is a subsystem or operation that extracts samples from a continuous signal. A
theoretical ideal sampler produces samples equivalent to the instantaneous value of the
continuous signal at the desired points.

Contents

• 1 Theory
o 1.1 Observation period
• 2 Practical implications
• 3 Applications
o 3.1 Audio sampling
 3.1.1 Sampling rate
 3.1.2 Bit depth (quantization)
 3.1.3 Speech sampling
o 3.2 Video sampling
• 4 Undersampling
• 5 Oversampling
• 6 Complex sampling
o 6.1 Notes
• 7 See also
• 8 References

• 9 External links

[edit] Theory
See also: Nyquist–Shannon sampling theorem

For convenience, we will discuss signals which vary with time. However, the same
results can be applied to signals varying in space or in any other dimension and similar
results are obtained in two or more dimensions.

Let x(t) be a continuous signal that is to be sampled, and suppose that sampling is
performed by measuring the value of the continuous signal every T seconds, which is
called the sampling interval. The sampled signal x[n] is then given by:

x[n] = x(nT), with n = 0, 1, 2, 3, ...

The sampling frequency or sampling rate fs is defined as the number of samples obtained
in one second, or fs = 1/T. The sampling rate is measured in hertz or in samples per
second.
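
In code, ideal sampling is simply evaluation of x at the points nT; a trivial sketch
(the cosine and the rate are example choices, not from the text):

    import numpy as np

    def sample(x, T, num_samples):
        """Return x[n] = x(nT) for n = 0, 1, ..., num_samples - 1."""
        n = np.arange(num_samples)
        return x(n * T)

    fs = 100.0                                         # sampling rate, Hz (example)
    x_n = sample(lambda t: np.cos(2 * np.pi * 5 * t),  # 5 Hz cosine (example)
                 T=1 / fs, num_samples=10)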

We can now ask: under what circumstances is it possible to reconstruct the original signal
completely and exactly (perfect reconstruction)?

A partial answer is provided by the Nyquist–Shannon sampling theorem, which provides
a sufficient (but not always necessary) condition under which perfect reconstruction is
possible. The sampling theorem guarantees that bandlimited signals (i.e., signals which
have a maximum frequency) can be reconstructed perfectly from their sampled version, if
the sampling rate is more than twice the maximum frequency. Reconstruction in this case
can be achieved using the Whittaker–Shannon interpolation formula.

The frequency equal to one-half of the sampling rate is therefore a bound on the highest
frequency that can be unambiguously represented by the sampled signal. This frequency
(half the sampling rate) is called the Nyquist frequency of the sampling system.
Frequencies above the Nyquist frequency fN can be observed in the sampled signal, but
their frequency is ambiguous. That is, a frequency component with frequency f cannot be
distinguished from other components with frequencies N·fN + f and N·fN − f for nonzero
integers N. This ambiguity is called aliasing. To handle this problem as gracefully as
possible, most analog signals are filtered with an anti-aliasing filter (usually a low-pass
filter with cutoff near the Nyquist frequency) before conversion to the sampled discrete
representation.
[edit] Observation period

The observation period is the span of time during which a series of data samples are
collected at regular intervals. More broadly, it can refer to any specific period during
which a set of data points is gathered, regardless of whether or not the data is periodic in
nature. Thus a researcher might study the incidence of earthquakes and tsunamis over a
particular time period, such as a year or a century.

The observation period is simply the span of time during which the data is studied,
regardless of whether data so gathered represents a set of discrete events having arbitrary
timing within the interval, or whether the samples are explicitly bound to specified sub-
intervals.

[edit] Practical implications


In practice, the continuous signal is sampled using an analog-to-digital converter (ADC),
a non-ideal device with various physical limitations. This results in deviations from the
theoretically perfect reconstruction capabilities, collectively referred to as distortion.

Various types of distortion can occur, including:

• Aliasing. A precondition of the sampling theorem is that the signal be
bandlimited. However, in practice, no time-limited signal can be bandlimited.
Since signals of interest are almost always time-limited (e.g., at most spanning the
lifetime of the sampling device in question), it follows that they are not
bandlimited. However, by designing a sampler with an appropriate guard band, it
is possible to obtain output that is as accurate as necessary.
• Integration effect or aperture effect. This results from the fact that the sample is
obtained as a time average within a sampling region, rather than just being equal
to the signal value at the sampling instant. The integration effect is readily
noticeable in photography when the exposure is too long and creates a blur in the
image. An ideal camera would have an exposure time of zero. In a capacitor-
based sample and hold circuit, the integration effect is introduced because the
capacitor cannot instantly change voltage thus requiring the sample to have non-
zero width.
• Jitter or deviation from the precise sample timing intervals.
• Noise, including thermal sensor noise, analog circuit noise, etc.
• Slew rate limit error, caused by an inability for an ADC output value to change
sufficiently rapidly.
• Quantization as a consequence of the finite precision of words that represent the
converted values.
• Error due to other non-linear effects of the mapping of input voltage to converted
output value (in addition to the effects of quantization).

The conventional, practical digital-to-analog converter (DAC) does not output a sequence
of Dirac impulses (which, if ideally low-pass filtered, would result in the original signal
before sampling) but instead outputs a sequence of piecewise constant values or
rectangular pulses. This means that there is an inherent effect of the zero-order hold on
the effective frequency response of the DAC, resulting in a mild roll-off of gain at the
higher frequencies (a 3.9224 dB loss at the Nyquist frequency). This zero-order hold
effect is a consequence of the hold action of the DAC and is not due to the sample and
hold that might precede a conventional ADC as is often misunderstood. The DAC can
also suffer errors from jitter, noise, slewing, and non-linear mapping of input value to
output voltage.

Jitter, noise, and quantization are often analyzed by modeling them as random errors
added to the sample values. Integration and zero-order hold effects can be analyzed as a
form of low-pass filtering. The non-linearities of either ADC or DAC are analyzed by
replacing the ideal linear function mapping with a proposed nonlinear function.
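
The zero-order-hold figure quoted above can be reproduced directly: the ZOH magnitude
response is |sinc(f/fs)| (normalized sinc), which at the Nyquist frequency f = fs/2
evaluates to 2/π. A one-line check, included here only as a sketch:

    import numpy as np

    loss_db = 20 * np.log10(np.sinc(0.5))   # ZOH gain at f = fs/2 (normalized sinc)
    print(f"{loss_db:.4f} dB")              # prints -3.9224 dB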

[edit] Applications
[edit] Audio sampling

Digital audio uses pulse-code modulation and digital signals for sound reproduction. This
includes analog-to-digital conversion (ADC), digital-to-analog conversion (DAC),
storage, and transmission. In effect, the system commonly referred to as digital is in fact a
discrete-time, discrete-level analog of a previous electrical analog. While modern systems
can be quite subtle in their methods, the primary usefulness of a digital system is the
ability to store, retrieve and transmit signals without any loss of quality.

[edit] Sampling rate

When it is necessary to capture audio covering the entire 20–20,000 Hz range of human
hearing, such as when recording music or many types of acoustic events, audio
waveforms are typically sampled at 44.1 kHz (CD), 48 kHz (professional audio), or
96 kHz. The approximately double-rate requirement is a consequence of the Nyquist
theorem.

There has been an industry trend towards sampling rates well beyond the basic
requirements; 96 kHz and even 192 kHz are available.[1] This is in contrast with
laboratory experiments, which have failed to show that ultrasonic frequencies are audible
to human observers; however in some cases ultrasonic sounds do interact with and
modulate the audible part of the frequency spectrum (intermodulation distortion). It is
noteworthy that intermodulation distortion is not present in the live audio and so it
represents an artificial coloration to the live sound.[2]

One advantage of higher sampling rates is that they can relax the low-pass filter design
requirements for ADCs and DACs, but with modern oversampling sigma-delta converters
this advantage is less important.
[edit] Bit depth (quantization)

Audio is typically recorded at 8-, 16-, and 20-bit depth, which yield a theoretical
maximum signal-to-quantization-noise ratio (SQNR) for a pure sine wave of,
approximately, 49.93 dB, 98.09 dB and 122.17 dB, respectively.[3] Eight-bit audio is
generally not used due to prominent and inherent quantization noise (low maximum
SQNR), although the A-law and μ-law 8-bit encodings pack more resolution into 8 bits
at the cost of increased total harmonic distortion. CD quality audio is recorded at
16-bit. In practice, not many
consumer stereos can produce more than about 90 dB of dynamic range, although some
can exceed 100 dB. Thermal noise limits the true number of bits that can be used in
quantization. Few analog systems have signal to noise ratios (SNR) exceeding 120 dB;
consequently, few situations will require more than 20-bit quantization.

For playback and not recording purposes, a proper analysis of typical programme levels
throughout an audio system reveals that the capabilities of well-engineered 16-bit
material far exceed those of the very best hi-fi systems, with the microphone noise and
loudspeaker headroom being the real limiting factors[citation needed].
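
The dB figures quoted above follow from the closed form SQNR = 10·log10(1.5·4^b),
about 6.02·b + 1.76 dB, for a full-scale sine wave; a quick sketch using that formula:

    import math

    for bits in (8, 16, 20):
        sqnr_db = 10 * math.log10(1.5 * 4 ** bits)
        print(f"{bits}-bit: {sqnr_db:.2f} dB")   # 49.93, 98.09, 122.17 dB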

[edit] Speech sampling

Speech signals, i.e., signals intended to carry only human speech, can usually be sampled
at a much lower rate. For most phonemes, almost all of the energy is contained in the
5 Hz to 4 kHz range, allowing a sampling rate of 8 kHz. This is the sampling rate used by
nearly all telephony systems, which use the G.711 sampling and quantization
specifications.

[edit] Video sampling

Standard-definition television (SDTV) uses either 720 by 480 pixels (US NTSC 525-line)
or 704 by 576 pixels (UK PAL 625-line) for the visible picture area.

High-definition television (HDTV) is currently moving towards three standards referred
to as 720p (progressive), 1080i (interlaced) and 1080p (progressive, also known as Full-
HD) which all 'HD-Ready' sets will be able to display.

[edit] Undersampling
Plot of sample rates (y axis) versus the upper edge frequency (x axis) for a band of width
1; gray areas are combinations that are "allowed" in the sense that no two frequencies in
the band alias to the same frequency. The darker gray areas correspond to undersampling
with the lowest allowable sample rate.
Main article: Undersampling

When one samples a bandpass signal at a rate lower than the Nyquist rate, the samples
are equal to samples of a low-frequency alias of the high-frequency signal; the original
signal will still be uniquely represented and recoverable if the spectrum of its alias does
not cross over half the sampling rate. Such undersampling is also known as bandpass
sampling, harmonic sampling, IF sampling, and direct IF to digital conversion.[4]
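
A small numeric sketch of this effect (the 10.2 kHz tone and 4 kHz rate are assumed
example values): the undersampled tone is captured, intact, as a 1.8 kHz alias.

    import numpy as np

    fs = 4000.0                                  # sample rate (example value)
    n = np.arange(64)
    x_if = np.cos(2 * np.pi * 10200.0 * n / fs)  # deliberately undersampled IF tone
    x_bb = np.cos(2 * np.pi * 1800.0 * n / fs)   # its alias: |10200 - 3*4000| = 1800 Hz

    assert np.allclose(x_if, x_bb)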

[edit] Oversampling
Oversampling is used in most modern analog-to-digital converters to reduce the
distortion introduced by practical digital-to-analog converters, such as a zero-order hold
instead of idealizations like the Whittaker–Shannon interpolation formula.

[edit] Complex sampling


Complex sampling refers to the simultaneous sampling of two different, but related,
waveforms, resulting in pairs of samples that are subsequently treated as complex
numbers. Usually one waveform, ŝ(t), is the Hilbert transform of the other waveform,
s(t), and the complex-valued function s_a(t) = s(t) + i·ŝ(t) is called an
analytic signal, whose Fourier transform is zero for all negative values of frequency. In
that case, the Nyquist rate for a waveform with no frequencies ≥ B can be reduced to just
B (complex samples/sec), instead of 2B (real samples/sec).[note 1] More apparently, the
equivalent baseband waveform, s_a(t)·e^{−iπBt}, also has a Nyquist rate of B, because
all of its non-zero frequency content is shifted into the interval [−B/2, B/2).

Although complex-valued samples can be obtained as described above, they are much
more commonly created by manipulating samples of a real-valued waveform. For
instance, the equivalent baseband waveform can be created without explicitly computing
ŝ(t),[note 2] by processing the product sequence s(nT)·e^{−iπBnT} through a
digital lowpass filter whose cutoff frequency is B/2.[note 3] Computing only every other
sample of the output sequence reduces the sample-rate commensurate with the reduced
Nyquist rate. The result is half as many complex-valued samples as the original number
of real samples. No information is lost, and the original s(t) waveform can be recovered,
if necessary.
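
A sketch of this procedure using scipy's Hilbert-transform helper (the rates, band edge,
and tone are assumed example values):

    import numpy as np
    from scipy.signal import hilbert

    fs = 8000.0                        # real-sample rate (example value)
    B = 4000.0                         # signal assumed confined to [0, B)
    n = np.arange(1024)
    s = np.cos(2 * np.pi * 1200.0 * n / fs)

    s_a = hilbert(s)                   # analytic signal: s + j*s_hat
    baseband = s_a * np.exp(-1j * np.pi * B * n / fs)  # shift spectrum to [-B/2, B/2)
    z = baseband[::2]                  # every other sample: B complex samples/sec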

Signal-to-quantization-noise ratio

Signal-to-Quantization-Noise Ratio (SQNR or SNqR) is a widely used quality measure in
analysing digitizing schemes such as PCM (pulse code modulation) and multimedia
codecs. The SQNR reflects the relationship between the maximum nominal signal
strength and the quantization error (also known as quantization noise) introduced in the
analog-to-digital conversion.

The SQNR formula is derived from the general SNR (Signal-to-Noise Ratio) formula for
the binary pulse-code modulated communication channel:

    \mathrm{SNR} = \frac{3 \cdot 2^{2\nu}}{1 + 4 P_e \left(2^{2\nu} - 1\right)} \cdot \frac{\overline{m^2(t)}}{m_p^2(t)}

where

    Pe is the probability of received bit error,
    mp(t) is the peak message signal level, and
    \overline{m(t)} is the mean message signal level.

As SQNR applies to quantized signals, the formulae for SQNR refer to discrete-time
digital signals. Instead of m(t), we will use the digitized signal x(n). For N quantization
steps, each sample x requires ν = log2 N bits. The probability density function (pdf)
representing the distribution of values in x can be denoted as f(x). The maximum
magnitude value of any x is denoted by xmax.

As SQNR, like SNR, is a ratio of signal power to some noise power, it can be calculated
as:

    \mathrm{SQNR} = \frac{P_{signal}}{P_{noise}} = \frac{E[x^2]}{E[\tilde{x}^2]}

The signal power is:

    \overline{x^2} = E[x^2] = \int x^2 f(x)\, dx

The quantization noise power can be expressed as:

    E[\tilde{x}^2] = \frac{\Delta^2}{12} = \frac{x_{max}^2}{3 \cdot 4^\nu}

Giving:

    \mathrm{SQNR} = \frac{3 \cdot 4^\nu \cdot \overline{x^2}}{x_{max}^2}

When the SQNR is desired in terms of decibels (dB), a useful approximation to SQNR
is:

    \mathrm{SQNR}\big|_{dB} \approx 6.02\,\nu + 4.77 + 10\log_{10}\!\left(\frac{\overline{x^2}}{x_{max}^2}\right)

where ν is the number of bits in a quantized sample, and \overline{x^2} is the signal power
calculated above. Note that for each bit added to a sample, the SQNR goes up by
approximately 6 dB (20·log10(2) ≈ 6.02 dB).
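
The approximation can be checked empirically by uniformly quantizing a full-scale sine
wave, for which the last term above equals 10·log10(1/2) and the total reduces to about
6.02·ν + 1.76 dB. In the sketch below, the mid-rise quantizer and the 12-bit depth are
assumptions made for the demonstration:

    import numpy as np

    nu = 12                                    # bits per sample (example value)
    x_max = 1.0
    t = np.linspace(0, 1, 100000, endpoint=False)
    x = x_max * np.sin(2 * np.pi * 50 * t)     # full-scale sine

    delta = 2 * x_max / 2 ** nu                # quantization step
    xq = delta * (np.floor(x / delta) + 0.5)   # mid-rise uniform quantizer

    sqnr_db = 10 * np.log10(np.mean(x ** 2) / np.mean((x - xq) ** 2))
    print(f"measured {sqnr_db:.2f} dB, theory {6.02 * nu + 1.76:.2f} dB")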

In telecommunication, the term signal compression has the following meanings:

In analog (usually audio) systems, reduction of the dynamic range of a signal by
controlling it as a function of the inverse relationship of its instantaneous value relative to
a specified reference level.

Signal compression is usually expressed in dB.

Instantaneous values of the input signal that are low, relative to the reference level, are
increased, and those that are high are decreased.

Signal compression is usually accomplished by separate devices called "compressors." It
is used for many purposes, such as (a) improving signal-to-noise ratios prior to digitizing
an analog signal for transmission over a digital carrier system, (b) preventing overload of
succeeding elements of a system, or (c) matching the dynamic ranges of two devices.
Signal compression (in dB) may be a linear or nonlinear function of the signal level
across the frequency band of interest and may be essentially instantaneous or have fixed
or variable delay times.

Signal compression always introduces distortion, which is usually not objectionable if
the compression is limited to a few dB.

The original dynamic range of a compressed signal may be restored by a circuit called an
"expander".

This article incorporates public domain material from the General Services
Administration document "Federal Standard 1037C" (in support of MIL-STD-188).

Modified discrete cosine transform



The modified discrete cosine transform (MDCT) is a Fourier-related transform based
on the type-IV discrete cosine transform (DCT-IV), with the additional property of being
lapped: it is designed to be performed on consecutive blocks of a larger dataset, where
subsequent blocks are overlapped so that the last half of one block coincides with the first
half of the next block. This overlapping, in addition to the energy-compaction qualities of
the DCT, makes the MDCT especially attractive for signal compression applications,
since it helps to avoid artifacts stemming from the block boundaries. As a result of these
advantages, the MDCT is employed in most modern lossy audio formats, including MP3,
AC-3, Vorbis, Windows Media Audio, ATRAC, Cook, and AAC.

The MDCT was proposed by Princen, Johnson, and Bradley in 1987, following earlier
(1986) work by Princen and Bradley to develop the MDCT's underlying principle of
time-domain aliasing cancellation (TDAC), described below. (There also exists an
analogous transform, the MDST, based on the discrete sine transform, as well as other,
rarely used, forms of the MDCT based on different types of DCT or DCT/DST
combinations.)

In MP3, the MDCT is not applied to the audio signal directly, but rather to the output of a
32-band polyphase quadrature filter (PQF) bank. The output of this MDCT is
postprocessed by an alias reduction formula to reduce the typical aliasing of the PQF
filter bank. Such a combination of a filter bank with an MDCT is called a hybrid filter
bank or a subband MDCT. AAC, on the other hand, normally uses a pure MDCT; only
the (rarely used) MPEG-4 AAC-SSR variant (by Sony) uses a four-band PQF bank
followed by an MDCT. Similar to MP3, ATRAC uses stacked quadrature mirror filters
(QMF) followed by an MDCT.
Contents

• 1 Definition
o 1.1 Inverse transform
o 1.2 Computation
• 2 Window functions
• 3 Relationship to DCT-IV and Origin of TDAC
o 3.1 Origin of TDAC
o 3.2 TDAC for the windowed MDCT
• 4 See also

• 5 References

[edit] Definition
As a lapped transform, the MDCT is a bit unusual compared to other Fourier-related
transforms in that it has half as many outputs as inputs (instead of the same number). In
particular, it is a linear function F : R^{2N} → R^N (where R denotes the set of real
numbers). The 2N real numbers x0, ..., x2N-1 are transformed into the N real numbers X0, ...,
XN-1 according to the formula:

    X_k = \sum_{n=0}^{2N-1} x_n \cos\!\left[\frac{\pi}{N}\left(n + \frac{1}{2} + \frac{N}{2}\right)\left(k + \frac{1}{2}\right)\right], \quad k = 0, \ldots, N-1

(The normalization coefficient in front of this transform, here unity, is an arbitrary
convention and differs between treatments. Only the product of the normalizations of the
MDCT and the IMDCT, below, is constrained.)

[edit] Inverse transform

The inverse MDCT is known as the IMDCT. Because there are different numbers of
inputs and outputs, at first glance it might seem that the MDCT should not be invertible.
However, perfect invertibility is achieved by adding the overlapped IMDCTs of
subsequent overlapping blocks, causing the errors to cancel and the original data to be
retrieved; this technique is known as time-domain aliasing cancellation (TDAC).

The IMDCT transforms N real numbers X0, ..., XN-1 into 2N real numbers y0, ..., y2N-1
according to the formula:

    y_n = \frac{1}{N} \sum_{k=0}^{N-1} X_k \cos\!\left[\frac{\pi}{N}\left(n + \frac{1}{2} + \frac{N}{2}\right)\left(k + \frac{1}{2}\right)\right], \quad n = 0, \ldots, 2N-1

(Like for the DCT-IV, an orthogonal transform, the inverse has the same form as the
forward transform.)

In the case of a windowed MDCT with the usual window normalization (see below), the
normalization coefficient in front of the IMDCT should be multiplied by 2 (i.e.,
becoming 2/N).
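
Both formulas can be transcribed directly into code. The following sketch is an O(N²)
reference implementation, with the same normalization as above (unity forward, 1/N
inverse) and arbitrary random test data; it also verifies the TDAC property numerically
by overlap-adding two 50%-overlapped blocks.

    import numpy as np

    def mdct(x):
        N = len(x) // 2
        n = np.arange(2 * N)[None, :]
        k = np.arange(N)[:, None]
        return (x[None, :] * np.cos(np.pi / N * (n + 0.5 + N / 2) * (k + 0.5))).sum(axis=1)

    def imdct(X):
        N = len(X)
        n = np.arange(2 * N)[:, None]
        k = np.arange(N)[None, :]
        return (X[None, :] * np.cos(np.pi / N * (n + 0.5 + N / 2) * (k + 0.5))).sum(axis=1) / N

    rng = np.random.default_rng(0)
    N = 128                                    # block half-size (example value)
    q = rng.standard_normal((6, N // 2))       # quarters a, b, c, d, e, f
    y1 = imdct(mdct(np.concatenate(q[0:4])))   # block (a, b, c, d)
    y2 = imdct(mdct(np.concatenate(q[2:6])))   # 50%-overlapped block (c, d, e, f)

    # Overlap-add of the two IMDCTs recovers the shared middle data (c, d).
    assert np.allclose(y1[N:] + y2[:N], np.concatenate(q[2:4]))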

[edit] Computation

Although the direct application of the MDCT formula would require O(N2) operations, it
is possible to compute the same thing with only O(N log N) complexity by recursively
factorizing the computation, as in the fast Fourier transform (FFT). One can also compute
MDCTs via other transforms, typically a DFT (FFT) or a DCT, combined with O(N) pre-
and post-processing steps. Also, as described below, any algorithm for the DCT-IV
immediately provides a method to compute the MDCT and IMDCT of even size.

[edit] Window functions


In typical signal-compression applications, the transform properties are further improved
by using a window function wn (n = 0, ..., 2N-1) that is multiplied with xn and yn in the
MDCT and IMDCT formulas, above, in order to avoid discontinuities at the n = 0 and 2N
boundaries by making the function go smoothly to zero at those points. (That is, we
window the data before the MDCT and after the IMDCT.) In principle, x and y could
have different window functions, and the window function could also change from one
block to the next (especially for the case where data blocks of different sizes are
combined), but for simplicity we consider the common case of identical window
functions for equal-sized blocks.

The transform remains invertible (that is, TDAC works), for a symmetric window wn =
w2N-1-n, as long as w satisfies the Princen-Bradley condition:

    w_n^2 + w_{n+N}^2 = 1

Various different window functions are common, e.g.

    w_n = \sin\!\left[\frac{\pi}{2N}\left(n + \frac{1}{2}\right)\right]

for MP3 and MPEG-2 AAC, and

    w_n = \sin\!\left(\frac{\pi}{2} \sin^2\!\left[\frac{\pi}{2N}\left(n + \frac{1}{2}\right)\right]\right)

for Vorbis. AC-3 uses a Kaiser-Bessel derived (KBD) window, and MPEG-4 AAC can
also use a KBD window.

Note that windows applied to the MDCT are different from windows used for other types
of signal analysis, since they must fulfill the Princen-Bradley condition. One of the
reasons for this difference is that MDCT windows are applied twice, for both the MDCT
(analysis) and the IMDCT (synthesis).
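
Both the symmetry and the Princen-Bradley condition are easy to verify numerically for
the sine window quoted above; a short sketch (the window length is arbitrary):

    import numpy as np

    N = 256                                            # half-length (example value)
    n = np.arange(2 * N)
    w = np.sin(np.pi / (2 * N) * (n + 0.5))            # MP3 / MPEG-2 AAC sine window

    assert np.allclose(w, w[::-1])                     # symmetry: w_n = w_{2N-1-n}
    assert np.allclose(w[:N] ** 2 + w[N:] ** 2, 1.0)   # Princen-Bradley condition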

[edit] Relationship to DCT-IV and Origin of TDAC


As can be seen by inspection of the definitions, for even N the MDCT is essentially
equivalent to a DCT-IV, where the input is shifted by N/2 and two N-blocks of data are
transformed at once. By examining this equivalence more carefully, important properties
like TDAC can be easily derived.

In order to define the precise relationship to the DCT-IV, one must realize that the DCT-
IV corresponds to alternating even/odd boundary conditions: even at its left boundary
(around n=–1/2), odd at its right boundary (around n=N–1/2), and so on (instead of
periodic boundaries as for a DFT). This follows from the identities

    \cos\!\left[\frac{\pi}{N}\left(-n - 1 + \frac{1}{2}\right)\left(k + \frac{1}{2}\right)\right] = \cos\!\left[\frac{\pi}{N}\left(n + \frac{1}{2}\right)\left(k + \frac{1}{2}\right)\right]

and

    \cos\!\left[\frac{\pi}{N}\left(2N - n - 1 + \frac{1}{2}\right)\left(k + \frac{1}{2}\right)\right] = -\cos\!\left[\frac{\pi}{N}\left(n + \frac{1}{2}\right)\left(k + \frac{1}{2}\right)\right].

Thus, if its inputs are an array x of length N, we can imagine extending this array to (x,
–xR, –x, xR, ...) and so on, where xR denotes x in reverse order.

Consider an MDCT with 2N inputs and N outputs, where we divide the inputs into four
blocks (a, b, c, d) each of size N/2. If we shift these by N/2 (from the +N/2 term in the
MDCT definition), then (b, c, d) extend past the end of the N DCT-IV inputs, so we must
"fold" them back according to the boundary conditions described above.

Thus, the MDCT of 2N inputs (a, b, c, d) is exactly equivalent to a DCT-IV of the
N inputs: (–cR–d, a–bR), where R denotes reversal as above.

(In this way, any algorithm to compute the DCT-IV can be trivially applied to the
MDCT.)

Similarly, the IMDCT formula above is precisely 1/2 of the DCT-IV (which is its own
inverse), where the output is shifted by N/2 and extended (via the boundary conditions) to
a length 2N. The inverse DCT-IV would simply give back the inputs (–cR–d, a–bR) from
above. When this is shifted and extended via the boundary conditions, one obtains:

IMDCT(MDCT(a, b, c, d)) = (a–bR, b–aR, c+dR, d+cR) / 2.


Half of the IMDCT outputs are thus redundant, as b–aR = –(a–bR)R, and likewise for the
last two terms.

One can now understand how TDAC works. Suppose that one computes the MDCT of
the subsequent, 50% overlapped, 2N block (c, d, e, f). The IMDCT will then yield,
analogous to the above: (c–dR, d–cR, e+fR, eR+f) / 2. When this is added with the previous
IMDCT result in the overlapping half, the reversed terms cancel and one obtains simply
(c, d), recovering the original data.

[edit] Origin of TDAC

The origin of the term "time-domain aliasing cancellation" is now clear. The use of input
data that extend beyond the boundaries of the logical DCT-IV causes the data to be
aliased in exactly the same way that frequencies beyond the Nyquist frequency are
aliased to lower frequencies, except that this aliasing occurs in the time domain instead of
the frequency domain. Hence the combinations c–dR and so on, which have precisely the
right signs for the combinations to cancel when they are added.

For odd N (which are rarely used in practice), N/2 is not an integer so the MDCT is not
simply a shift permutation of a DCT-IV. In this case, the additional shift by half a sample
means that the MDCT/IMDCT becomes equivalent to the DCT-III/II, and the analysis is
analogous to the above.

[edit] TDAC for the windowed MDCT

Above, the TDAC property was proved for the ordinary MDCT, showing that adding
IMDCTs of subsequent blocks in their overlapping half recovers the original data. The
derivation of this inverse property for the windowed MDCT is only slightly more
complicated.

Recall from above that when (a, b, c, d) and (c, d, e, f) are MDCTed, IMDCTed, and added
in their overlapping half, we obtain (c + dR, cR + d)/2 + (c − dR, d − cR)/2 = (c, d), the
original data.

Now we suppose that we multiply both the MDCT inputs and the IMDCT outputs by a
window function of length 2N. As above, we assume a symmetric window function,
which is therefore of the form (w,z,zR,wR) where w and z are length-N/2 vectors and R
denotes reversal as before. Then the Princen-Bradley condition can be written as:

    w^2 + z_R^2 = (1, 1, \ldots),

with the multiplications and additions performed elementwise, or equivalently

    z^2 + w_R^2 = (1, 1, \ldots)

(reversing w and z).

Therefore, instead of MDCTing (a, b, c, d), we now MDCT (wa, zb, zRc, wRd) (with all
multiplications performed elementwise). When this is IMDCTed and multiplied again
(elementwise) by the window function, the last-N half becomes:

    (z_R, w_R) \cdot (z_R c + w d_R,\; w_R d + z c_R) = (z_R^2 c + w z_R d_R,\; w_R^2 d + z w_R c_R).

(Note that we no longer have the multiplication by 1/2, because the IMDCT
normalization differs by a factor of 2 in the windowed case.)

Similarly, the windowed MDCT and IMDCT of (c, d, e, f) yields, in its first-N half:

    (w, z) \cdot (w c - z_R d_R,\; z d - w_R c_R) = (w^2 c - w z_R d_R,\; z^2 d - z w_R c_R).

When we add these two halves together, we obtain:

    \left((z_R^2 + w^2)\, c,\; (w_R^2 + z^2)\, d\right) = (c, d),

recovering the original data.

Quadrature mirror filter



In digital signal processing, a quadrature mirror filter is a filter most commonly used
to implement a filter bank that splits an input signal into two bands. The resulting high-
pass and low-pass signals are often reduced by a factor of 2, giving a critically sampled
two-channel representation of the original signal.

The analysis filters are related by the following formula:

    |H_0(e^{j\Omega})|^2 + |H_1(e^{j\Omega})|^2 = 1

where Ω is the frequency, and the sampling rate is normalized to 2π.

In other words, the power sum of the high-pass and low-pass filters is equal to 1. The
filter responses are symmetric about Ω = π/2:

    |H_1(e^{j\Omega})| = |H_0(e^{j(\pi - \Omega)})|


Orthogonal wavelets (the Haar wavelets and related Daubechies wavelets, Coiflets, and
some developed by Mallat) are generated by scaling functions which, together with the
wavelet, satisfy a quadrature mirror filter relationship.
