Delta Modulation For Voice Transmission: Application Note January 1997 AN607.1
Introduction
Delta modulation has evolved into a simple, efficient method of digitizing voice for secure, reliable communications and for voice I/O in data processing. To illustrate the basic principles, a very simple delta modulator and demodulator are shown in Figure 1. The modulator is a sampled data system employing a negative feedback loop. A comparator senses whether the instantaneous level of the analog voice input is greater or less than the feedback signal. The comparator output is clocked by a flip-flop to form a continuous NRZ digital data stream. This digital data is also integrated and fed back to the comparator. The feedback is such that the integrator ramps up and down to produce a rough approximation of the input waveform. An identical integrator in the demodulator produces the same waveform, which, when filtered, reproduces the voice.
One can see that the digital data 0s and 1s are commands to the integrators to go up or go down, respectively. Another way of looking at it is that the digital data stream also has analog significance: it approximates the differential of the voice, since analog integration of the data reproduces the voice. Note that the integrator output never stands still; it always travels either up or down by a fixed amount in any clock period.
Because of its fixed integrator output slope, the simple delta modulator is less than ideal for encoding human voice, which may have a wide dynamic amplitude range. The integrator cannot track large, high frequency signals with its fixed slope. Fortunately, human speech has statistically smaller amplitudes at higher frequencies; therefore an integrator time constant of about 1ms will satisfactorily reproduce voice in a 3kHz bandwidth. A more serious limitation is that voice amplitude changes which are less than the height of the integrator ramp during one clock period cannot be resolved.
So dynamic range is proportional to clock frequency, and satisfactory range cannot be obtained at desirably low clock rates. A means of effectively increasing dynamic range is called companding (compressing-expanding): at the modulator, small signals are given higher relative gain, and an inverse characteristic is produced at the demodulator.
The CVSD
A popular, effective scheme for companded delta modulation is known as CVSD (continuously variable slope delta modulation), shown in Figure 2. Additional digital logic, a second integrator, and an analog multiplier are added to the simple modulator. Under small input signal conditions, the second integrator (known as the syllabic filter) has no input, and circuit function is identical to the simple modulator, except that the multiplier is biased to output quite small ramp amplitudes, giving good resolution to the small signals. A larger signal input is characterized by consecutive strings of 1s or 0s in the data as the integrator attempts to track the input. The logic input to the syllabic filter actuates whenever 3 or more consecutive 0s or 1s are present in the data. When this happens, the syllabic filter output starts to build up, increasing the multiplier gain and passing larger amplitude ramps to the comparator, which enables the system to track the larger signal. Up to a limit, the more consecutive 1s or 0s generated, the larger the ramp amplitude. Since the larger signals increase the negative feedback of the modulator and the forward gain of the demodulator, companding takes place. In listening tests, a syllabic filter time constant of 4ms to 10ms is generally considered optimum.
An outstanding characteristic of CVSD is its ability, with fairly simple circuitry, to transmit intelligible voice at relatively low data rates. Companded PCM, for telephone quality transmission, requires a data rate of about 64K bits/sec per channel. CVSD produces equal quality at 32K bits/sec.
(However, at this rate it does not handle tone signals or phase encoded modem transmissions as well.) CVSD is useful at even lower data rates. At 16K bits/sec the reconstructed voice is remarkably natural, but has a slightly "fuzzy edge". At 9.6K bits/sec intelligibility is still excellent, although the sound is reminiscent of a damaged loudspeaker. Of course, very sophisticated speech compression techniques have been used to transmit speech at even lower data rates, but CVSD is an excellent compromise between circuit simplicity and bandwidth economy.
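The simple delta modulator and demodulator of Figure 1, described at the start of this introduction, can be sketched in a few lines. This is a minimal illustration, not the circuit itself: it assumes an ideal (non-leaky) integrator, and the names `delta_modulate`, `delta_demodulate`, `CLOCK_HZ`, and `STEP`, as well as the 16kHz clock, 250Hz tone, and step value, are assumptions chosen for the example. It follows the note's convention that a 0 commands the integrator up and a 1 commands it down.

```python
import math

def delta_modulate(samples, step):
    """Encode samples with a fixed-step delta modulator.

    Convention from the text: 0 = integrator goes up, 1 = integrator goes down.
    """
    bits = []
    estimate = 0.0  # integrator output, i.e. the feedback signal
    for x in samples:
        bit = 0 if x > estimate else 1          # comparator, clocked once per sample
        estimate += step if bit == 0 else -step  # integrator never stands still
        bits.append(bit)
    return bits

def delta_demodulate(bits, step):
    """Rebuild the staircase approximation with an identical integrator."""
    out = []
    estimate = 0.0
    for bit in bits:
        estimate += step if bit == 0 else -step
        out.append(estimate)
    return out

CLOCK_HZ = 16_000  # assumed clock rate
STEP = 0.06        # assumed fixed ramp height per clock period

# 250Hz tone whose steepest slope (about 0.049 per clock) is below STEP,
# so the fixed-slope integrator can follow it.
tone = [0.5 * math.sin(2 * math.pi * 250 * n / CLOCK_HZ) for n in range(640)]
bits = delta_modulate(tone, STEP)
recon = delta_demodulate(bits, STEP)
worst = max(abs(a - b) for a, b in zip(tone, recon))
print(f"worst-case tracking error: {worst:.3f} (step = {STEP})")
```

Raising the tone amplitude or frequency until its slope exceeds `STEP` per clock shows the slope limitation described above, while shrinking the tone below `STEP` shows the resolution limit: the output can only toggle by one full step.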
FIGURE 1.
FIGURE 2.
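The CVSD loop of Figure 2 can be approximated in software as follows. This is a sketch under stated assumptions, not the actual circuit: both integrators are modeled as single-pole leaky filters, the multiplier as a linear blend between a minimum and maximum step, and the step sizes are invented for illustration. The 1ms and 5ms time constants are taken from the ranges quoted in the text; `CvsdCore`, `cvsd_encode`, and `cvsd_decode` are hypothetical names.

```python
import math

class CvsdCore:
    """Run detector, syllabic filter, multiplier, and integrator shared by both ends."""

    def __init__(self, clock_hz, syllabic_tau=0.005, integrator_tau=0.001,
                 min_step=0.002, max_step=0.2):
        # Per-clock decay factors of the two single-pole filters.
        self.syl_decay = math.exp(-1.0 / (clock_hz * syllabic_tau))
        self.int_decay = math.exp(-1.0 / (clock_hz * integrator_tau))
        self.min_step = min_step  # assumed ramp height for small signals
        self.max_step = max_step  # assumed upper limit on ramp height
        self.history = []         # last three data bits
        self.syllabic = 0.0       # syllabic filter output, 0..1
        self.estimate = 0.0       # integrator output

    def update(self, bit):
        self.history = (self.history + [bit])[-3:]
        # Logic actuates on 3 or more consecutive identical bits.
        run_of_three = len(self.history) == 3 and len(set(self.history)) == 1
        drive = 1.0 if run_of_three else 0.0
        self.syllabic = self.syl_decay * self.syllabic + (1.0 - self.syl_decay) * drive
        step = self.min_step + (self.max_step - self.min_step) * self.syllabic
        direction = 1.0 if bit == 0 else -1.0  # 0 = up, 1 = down, as in the text
        self.estimate = self.int_decay * self.estimate + direction * step
        return self.estimate

def cvsd_encode(samples, clock_hz):
    """Return the data bits and the modulator's own feedback waveform."""
    core = CvsdCore(clock_hz)
    bits, feedback = [], []
    for x in samples:
        bit = 0 if x > core.estimate else 1  # comparator
        bits.append(bit)
        feedback.append(core.update(bit))
    return bits, feedback

def cvsd_decode(bits, clock_hz):
    """Run the same core from the received bits alone."""
    core = CvsdCore(clock_hz)
    return [core.update(bit) for bit in bits]

CLOCK_HZ = 16_000  # assumed clock rate
tone = [math.sin(2 * math.pi * 250 * n / CLOCK_HZ) for n in range(640)]
bits, feedback = cvsd_encode(tone, CLOCK_HZ)
decoded = cvsd_decode(bits, CLOCK_HZ)
idle_bits, _ = cvsd_encode([0.0] * 8, CLOCK_HZ)
```

Because the decoder runs the same state machine from the same bits, `decoded` matches the encoder's `feedback` sample for sample, and a zero input settles into the alternating idle pattern held in `idle_bits`.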
The CMOS digital circuit functions of Figure 3 closely parallel the equivalent analog functions in Figure 2. The filters are single pole recursive types using shift registers with feedback. A digital multiplier feeds a 10-bit R-2R DAC which reconstructs the voice waveform. The DAC output is in steps, rather than ramps.
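A single pole recursive filter of the kind mentioned above can be built from nothing more than an add and a shift, which is one reason its time constant depends only on the clock. The sketch below is a generic illustration, not the device's internal design: the shift values are assumptions chosen because they give time constants near 1ms and 4ms at a 16kHz clock, and `single_pole_step` and `time_constant_seconds` are hypothetical names.

```python
import math

def single_pole_step(state, target, shift):
    """One clock of y += (x - y) / 2**shift, using only an add and a shift."""
    return state + ((target - state) >> shift)

def time_constant_seconds(shift, clock_hz):
    """Time constant of the recursion above: -1 / (f_clk * ln(1 - 2**-shift))."""
    return -1.0 / (clock_hz * math.log(1.0 - 2.0 ** -shift))

# Step response: drive the filter toward 1024 for 16 clocks.
state = 0
for _ in range(16):
    state = single_pole_step(state, 1024, shift=4)

for shift in (4, 6):                  # assumed, illustrative shift values
    for clock_hz in (16_000, 32_000):
        tau_ms = 1000.0 * time_constant_seconds(shift, clock_hz)
        print(f"shift={shift} clock={clock_hz}Hz tau={tau_ms:.2f}ms")
```

Doubling the clock halves every time constant, with no component values to drift. The integer shift truncates, so the filter stops moving once the remaining error is smaller than `2**shift` counts.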
The digital CVSD has a number of advantages over its analog counterpart, and has desirable features which would otherwise require additional circuitry:
1) The all-CMOS device requires only 1mA of current from a single +4.5V to +7V supply.
2) No bulky external precision resistors or capacitors are required for the integrators; the time constants of the digital filters are set by the clock frequency and do not drift with time or temperature.
3) For best intelligibility and freedom from listener fatigue, it is important that the recovered audio be quiet during the pauses between spoken words. During quiet periods, an alternating 1, 0 pattern should be encoded, which, when decoded and filtered, will be inaudible. Achieving this in the analog CVSD requires that the up and down ramp slopes be precisely equal and that offsets in the comparator and amplifiers be adjusted to zero. Improper adjustment or excessive component drift can result in noisy oscillations. In the digital design, comparator offset and drift are adjusted by a long up-down counter summed to the DAC to ensure that, over a period of time, equal numbers of 1s and 0s are generated. An added feature is automatic quieting: if the DAC input would be less than 2 LSBs, the quieting pattern is generated instead. This has proven to aid intelligibility.
4) To prevent momentary overload when beginning to encode or decode, it is desirable to initialize the integrators. In the analog CVSD, external analog switches would be required to discharge the capacitors. In the digital CVSD, the filters are reset by momentarily putting the Force Zero pin low. At the same time, a quieting pattern is generated without affecting internal encoding by putting the Alternate Plain Text pin low.
5) In some analog CVSD designs, transient noise is generated during recovery from a low frequency overdriven input condition. The digital CVSD has a clipped output with instant recovery when overdriven.
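The quieting behavior in item 3 can be sketched as below. This is an interpretation rather than the device logic: it assumes "less than 2 LSBs" refers to the magnitude of the DAC input, and `idle_tone_hz`, `quieted_bit`, and `ones_minus_zeros` are hypothetical helper names.

```python
def idle_tone_hz(clock_hz):
    """An alternating 1, 0 pattern repeats every two clocks."""
    return clock_hz / 2.0

def quieted_bit(encoded_bit, dac_magnitude_lsb, clock_index, threshold_lsb=2):
    """Substitute the alternating pattern when the DAC input is below the threshold."""
    if dac_magnitude_lsb < threshold_lsb:
        return 1 if clock_index % 2 == 0 else 0
    return encoded_bit

def ones_minus_zeros(bits):
    """Up-down count: +1 for each 1, -1 for each 0; zero means balanced data."""
    return sum(1 if b else -1 for b in bits)

pattern = [quieted_bit(1, 0, i) for i in range(8)]
print(pattern, ones_minus_zeros(pattern), idle_tone_hz(16_000))
```

At a 16kHz clock the idle pattern is an 8kHz square wave, well above the 3kHz voice band, which is why it disappears after the output filter. A balanced up-down count is the condition the long counter in the digital design works to maintain.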
3) Audio Delay Lines: Although charge-coupled devices (CCDs) will perform this function, they are still expensive and the choice of configurations is quite limited. Also, there is a practical limit to the number of CCD stages, since each introduces a slight degradation to the signal. As shown in Figure 5, the delay line consists of a CVSD modulator, a shift register, and a demodulator. Delay is proportional to the number of register stages divided by the clock frequency. This can be used in speech scrambling, as explained above; echo suppression in PA systems; special echo effects; music enhancement or synthesis; and recursive or nonrecursive filtering.
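The delay relationship for the shift register in Figure 5 is simple enough to check numerically. The sketch below models only the register between the modulator and demodulator, and ignores any delay contributed by the filters; `delay_seconds` and `shift_register_delay` are hypothetical names, and the stage count and clock rate are example values.

```python
from collections import deque

def delay_seconds(stages, clock_hz):
    """Delay through the register: number of stages divided by clock frequency."""
    return stages / clock_hz

def shift_register_delay(bits, stages, fill=0):
    """Pass a bit stream through a shift register with stages >= 1."""
    register = deque([fill] * stages, maxlen=stages)
    out = []
    for bit in bits:
        out.append(register[0])  # the oldest bit leaves the register
        register.append(bit)     # the new bit enters; maxlen drops the oldest
    return out

print(delay_seconds(1024, 16_000))                 # 0.064
print(shift_register_delay([1, 1, 0, 1, 0], 3))    # [0, 0, 0, 1, 1]
```

A 1024-stage register clocked at 16kHz therefore gives 64ms of delay, and halving the clock doubles the delay at the cost of the lower-rate sound quality described earlier.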
4) Voice I/O: Digitized speech can be entered into a computer for storage, voice identification, or word recognition. Words stored in ROMs, disc memory, etc. can be used for voice output. CVSD, since it can operate at low data rates, is more efficient in storage requirements than PCM or other A-to-D conversions. Also, the data is in a useful form for filtering or other processing.
Figure 6 illustrates a simple evaluation breadboard circuit for the HC-55564. A single device is sufficient to evaluate sound quality, etc., since, when encoding, the feedback signal at pin 3 is identical to the decoded signal from a receiver. The following are some pointers for using the devices:
1) Power supply decoupling is essential, with the capacitor (C1 in Figure 6) located close to the I.C.
2) Power to the I.C. must be present before the audio input, the clock, or other digital inputs are applied. Failure to observe this may result in a latchup condition, which is usually not destructive and may be removed by cycling the supply off, then on.
3) Signal ground (pin 2) should be externally connected to pin 8 and power ground. It is recommended for noise-free operation that the audio input and output ground returns connect directly to pin 2 and to no other grounds in the system. Pins 6 and 7 must be open circuited.
4) Digital inputs and outputs are similar to and compatible with standard CMOS logic circuits using the same supply voltage. The illustrated 10K pullup resistors are necessary only with mechanical switches, and are not necessary when driving these pins with CMOS. Unused digital inputs should be tied to the appropriate supply rail for the
FIGURE 7. CVSD LARGE SIGNAL SINE WAVE RECONSTRUCTION (0.5ms/DIV; VOICE IN = 250Hz, 4VP-P SINE WAVE; CLOCK = 16kHz)
FIGURE 8. CVSD SMALL SIGNAL SINE WAVE RECONSTRUCTION (0.2ms/DIV; VOICE IN = 1kHz, 0.15VP-P SINE WAVE; CLOCK = 16kHz)
FIGURE 9. CVSD LARGE SIGNAL, LOW FREQUENCY CLIPPED WAVEFORM (0.5ms/DIV; VOICE IN = 250Hz, 6VP-P SINE WAVE; CLOCK = 16kHz)
FIGURE 10. CVSD ZERO SIGNAL IDLE PATTERN (50ms/DIV; VOICE IN = 0; CLOCK = 16kHz)
FIGURE 11. CVSD LARGE SIGNAL, HIGH FREQUENCY SLEW LIMITING (0.2ms/DIV; VOICE IN = 1kHz, 6VP-P SINE WAVE; CLOCK = 16kHz)
All Intersil semiconductor products are manufactured, assembled and tested under ISO9000 quality systems certification.
Intersil semiconductor products are sold by description only. Intersil Corporation reserves the right to make changes in circuit design and/or specifications at any time without notice. Accordingly, the reader is cautioned to verify that data sheets are current before placing orders. Information furnished by Intersil is believed to be accurate and reliable. However, no responsibility is assumed by Intersil or its subsidiaries for its use; nor for any infringements of patents or other rights of third parties which may result from its use. No license is granted by implication or otherwise under any patent or patent rights of Intersil or its subsidiaries. For information regarding Intersil Corporation and its products, see web site www.intersil.com