Data Communication Basics: What Is Data Communications
The distance over which data moves within a computer may vary from a few thousandths of an
inch, as is the case within a single IC chip, to as much as several feet along the backplane of the
main circuit board. Over such small distances, digital data may be transmitted as direct, two-
level electrical signals over simple copper conductors. Except for the fastest computers, circuit
designers are not very concerned about the shape of the conductor or the analog characteristics of
signal transmission.
Frequently, however, data must be sent beyond the local circuitry that constitutes a computer. In
many cases, the distances involved may be enormous. Unfortunately, as the distance between the
source of a message and its destination increases, accurate transmission becomes increasingly
difficult. This results from the electrical distortion of signals traveling through long conductors,
and from noise added to the signal as it propagates through a transmission medium. Although
some precautions must be taken for data exchange within a computer, the biggest problems occur
when data is transferred to devices outside the computer's circuitry. In this case, distortion and
noise can become so severe that information is lost.
Data Communications concerns the transmission of digital messages to devices external to the
message source. "External" devices are generally thought of as being independently powered
circuitry that exists beyond the chassis of a computer or other digital message source. As a rule,
the maximum permissible transmission rate of a message rises with signal power and falls with
channel noise. It is the aim of any communications system to
provide the highest possible transmission rate at the lowest possible power and with the least
possible noise.
Communications Channel
The message source is the transmitter, and the destination is the receiver. A channel whose
direction of transmission is unchanging is referred to as a simplex channel. For example, a radio
station is a simplex channel because it always transmits the signal to its listeners and never
allows them to transmit back.
A half-duplex channel is a single physical channel in which the direction may be reversed.
Messages may flow in two directions, but never at the same time, in a half-duplex system. In a
telephone call, one party speaks while the other listens. After a pause, the other party speaks and
the first party listens. Speaking simultaneously results in garbled sound that cannot be
understood.
Serial Communications
Most digital messages are vastly longer than just a few bits. Because it is neither practical nor
economic to transfer all bits of a long message simultaneously, the message is broken into
smaller parts and transmitted sequentially. Bit-serial transmission conveys a message one bit at a
time through a channel. Each bit represents a part of the message. The individual bits are then
reassembled at the destination to compose the message. In general, one channel will pass only
one bit at a time. Thus, bit-serial transmission is necessary in data communications if only a
single channel is available. Bit-serial transmission is normally just called serial transmission and
is the chosen communications method in many computer peripherals.
Byte-serial transmission conveys eight bits at a time through eight parallel channels. Although
the raw transfer rate is eight times faster than in bit-serial transmission, eight channels are
needed, and the cost may be as much as eight times higher to transmit the message. When
distances are short, it may nonetheless be both feasible and economic to use parallel channels in
return for high data rates. The popular Centronics printer interface is a case where byte-serial
transmission is used. As another example, it is common practice to use a 16-bit-wide data bus to
transfer data between a microprocessor and memory chips; this provides the equivalent of 16
parallel channels. On the other hand, when communicating with a timesharing system over a
modem, only a single channel is available, and bit-serial transmission is required. This figure
illustrates these ideas:
The baud rate refers to the signalling rate at which data is sent through a channel and is measured
in electrical transitions per second. In the EIA232 serial interface standard, one signal transition,
at most, occurs per bit, and the baud rate and bit rate are identical. In this case, a rate of 9600
baud corresponds to a transfer of 9,600 data bits per second with a bit period of 104
microseconds (1/9600 sec.). If two electrical transitions were required for each bit, as is the case
in return-to-zero coding, then at a rate of 9600 baud, only 4800 bits per second could be
conveyed. The channel efficiency is the fraction of the bits passed through the channel that carry
useful information. It excludes the framing, formatting, and error-detecting bits that may be
added to the information bits before a message is transmitted, and will always be less than one.
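The arithmetic in this paragraph can be checked with a short Python sketch. The function names and the `transitions_per_bit` parameter are illustrative assumptions, not part of any standard.

```python
def bit_period_us(baud: float) -> float:
    """Duration of one signalling element, in microseconds."""
    return 1_000_000 / baud


def bit_rate(baud: float, transitions_per_bit: int = 1) -> float:
    """Data bits per second when each bit needs the given number of transitions."""
    return baud / transitions_per_bit


print(round(bit_period_us(9600), 1))  # about 104 microseconds per bit
print(bit_rate(9600))                 # one transition per bit: bit rate equals baud rate
print(bit_rate(9600, 2))              # two transitions per bit: half the baud rate
```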
The data rate of a channel is often specified by its bit rate (often thought erroneously to be the
same as baud rate). However, an equivalent measure of channel capacity is bandwidth. In general,
the maximum data rate a channel can support is directly proportional to the channel's bandwidth
and inversely proportional to the channel's noise level.
A communications protocol is an agreed-upon convention that defines the order and meaning of
bits in a serial transmission. It may also specify a procedure for exchanging messages. A
protocol will define how many data bits compose a message unit, the framing and formatting
bits, any error-detecting bits that may be added, and other information that governs control of the
communications hardware. Channel efficiency is determined by the protocol design rather than
by digital hardware considerations. Note that there is a tradeoff between channel efficiency and
reliability - protocols that provide greater immunity to noise by adding error-detecting and
-correcting codes must necessarily become less efficient.
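As a rough illustration of how protocol overhead sets channel efficiency, the sketch below assumes an 11-bit packet carrying eight data bits plus three overhead bits (start, parity, and stop). The function name is hypothetical.

```python
def channel_efficiency(data_bits: int, overhead_bits: int) -> float:
    """Fraction of transmitted bits that carry message information."""
    return data_bits / (data_bits + overhead_bits)


# Assumed packet: 8 data bits plus start, parity, and stop bits.
efficiency = channel_efficiency(data_bits=8, overhead_bits=3)
useful_bits_per_second = 9600 * efficiency

print(f"efficiency: {efficiency:.3f}")
print(f"useful bits per second at 9600 bits/s: {useful_bits_per_second:.0f}")
```

Adding more error-detecting bits raises `overhead_bits` and lowers the result, which is the tradeoff described above.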
Serialized data is not generally sent at a uniform rate through a channel. Instead, there is usually
a burst of regularly spaced binary data bits followed by a pause, after which the data flow
resumes. Packets of binary data are sent in this manner, possibly with variable-length pauses
between packets, until the message has been fully transmitted. In order for the receiving end to
know the proper moment to read individual binary bits from the channel, it must know exactly
when a packet begins and how much time elapses between bits. When this timing information is
known, the receiver is said to be synchronized with the transmitter, and accurate data transfer
becomes possible. Failure to remain synchronized throughout a transmission will cause data to
be corrupted or lost.
Two basic techniques are employed to ensure correct synchronization. In synchronous systems,
separate channels are used to transmit data and timing information. The timing channel transmits
clock pulses to the receiver. Upon receipt of a clock pulse, the receiver reads the data channel
and latches the bit value found on the channel at that moment. The data channel is not read again
until the next clock pulse arrives. Because the transmitter originates both the data and the timing
pulses, the receiver will read the data channel only when told to do so by the transmitter (via the
clock pulse), and synchronization is guaranteed.
Techniques exist to merge the timing signal with the data so that only a single channel is
required. This is especially useful when synchronous transmissions are to be sent through a
modem. Two methods for encoding a data stream into an electrical waveform from which the
receiver can recover timing are nonreturn-to-zero coding, which relies on the data containing
enough transitions, and biphase Manchester coding, which places a transition in every bit.
In asynchronous systems, a separate timing channel is not used. The transmitter and receiver
must be preset in advance to an agreed-upon baud rate. A very accurate local oscillator within
the receiver will then generate an internal clock signal that is equal to the transmitter's within a
fraction of a percent. For the most common serial protocol, data is sent in small packets of 10 or
11 bits, eight of which constitute message information. When the channel is idle, the signal
voltage corresponds to a continuous logic '1'. A data packet always begins with a logic '0' (the
start bit) to signal the receiver that a transmission is starting. The start bit triggers an internal
timer in the receiver that generates the needed clock pulses. Following the start bit, eight bits of
message data are sent bit by bit at the agreed upon baud rate. The packet is concluded with a
parity bit and stop bit. One complete packet is illustrated below:
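The packet layout described above can be sketched in Python. This is a minimal illustration that assumes even parity, a single stop bit, and data bits sent least-significant bit first (a common convention that the text does not specify). The names `build_frame` and `parse_frame` are hypothetical.

```python
def build_frame(byte: int) -> list[int]:
    """Return an 11-bit packet: start bit, 8 data bits, even parity bit, stop bit."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte must fit in eight bits")
    data = [(byte >> i) & 1 for i in range(8)]  # assumed order: LSB first
    parity = sum(data) % 2                      # makes the total count of 1s even
    return [0] + data + [parity] + [1]          # start = 0, stop = 1 (idle level)


def parse_frame(bits: list[int]) -> int:
    """Check framing and parity, then rebuild the data byte."""
    if len(bits) != 11 or bits[0] != 0 or bits[-1] != 1:
        raise ValueError("framing error")
    data, parity = bits[1:9], bits[9]
    if (sum(data) + parity) % 2 != 0:
        raise ValueError("parity error")
    return sum(bit << i for i, bit in enumerate(data))


frame = build_frame(ord("A"))
print(frame)
print(chr(parse_frame(frame)))
```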
The packet length is short in asynchronous systems to minimize the risk that the local oscillators
in the receiver and transmitter will drift apart. When high-quality crystal oscillators are used,
synchronization can be guaranteed over an 11-bit period. Every time a new packet is sent, the
start bit resets the synchronization, so the pause between packets can be arbitrarily long. Note
that the EIA232 standard defines electrical, timing, and mechanical characteristics of a serial
interface. However, it does not include the asynchronous serial protocol shown in the previous
figure, or the ASCII alphabet described next.
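Why short packets tolerate oscillator drift can be estimated with an idealized calculation. The sketch assumes the receiver samples each bit at its midpoint, timed from the leading edge of the start bit, and that a sample is valid only while the accumulated timing error stays under half a bit period. Real receivers have further error sources, so practical tolerances are tighter than this bound.

```python
def max_clock_mismatch(bits_per_packet: int) -> float:
    """Largest fractional clock mismatch that still samples the last bit correctly.

    The last bit is sampled (bits_per_packet - 0.5) bit periods after the start
    edge; the accumulated error there must stay below half a bit period.
    """
    return 0.5 / (bits_per_packet - 0.5)


print(f"11-bit packet: {max_clock_mismatch(11) * 100:.2f} percent")
print(f"100-bit packet: {max_clock_mismatch(100) * 100:.2f} percent")
```

The bound shrinks as the packet grows, which is consistent with keeping asynchronous packets short.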
The ASCII Character Set
Characters sent through a serial interface generally follow the ASCII (American Standard Code
for Information Interchange) character standard:
This standard relates binary codes to printable characters and control codes. Fully 25 percent of
the ASCII character set represents nonprintable control codes, such as carriage return (CR) and
line feed (LF). Most modern character-oriented peripheral equipment abides by the ASCII
standard, and thus may be used interchangeably with different computers.
Noise and momentary electrical disturbances may cause data to be changed as it passes through a
communications channel. If the receiver fails to detect this, the received message will be
incorrect, resulting in possibly serious consequences. As a first line of defense against data
errors, they must be detected. If an error can be flagged, it might be possible to request that the
faulty packet be resent, or to at least prevent the flawed data from being taken as correct. If
sufficient redundant information is sent, one- or two-bit errors may be corrected by hardware
within the receiver before the corrupted data ever reaches its destination.
A parity bit is added to a data packet for the purpose of error detection. In the even-parity
convention, the value of the parity bit is chosen so that the total number of '1' digits in the
combined data plus parity packet is an even number. Upon receipt of the packet, the parity
needed for the data is recomputed by local hardware and compared to the parity bit received with
the data. If any bit has changed state, the parity will not match, and an error will have been
detected. In fact, if an odd number of bits (not just one) have been altered, the parity will not
match. If an even number of bits have been reversed, the parity will match even though an error
has occurred. However, a statistical analysis of data communication errors has shown that a
single-bit error is much more probable than a multibit error in the presence of random noise.
Thus, parity is a reliable method of error detection.
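A small Python sketch shows both the strength and the blind spot of even parity. The helper names are hypothetical, and the data bits are an arbitrary example.

```python
def even_parity_bit(data_bits: list[int]) -> int:
    """Parity bit that makes the total number of 1s (data plus parity) even."""
    return sum(data_bits) % 2


def parity_ok(packet: list[int]) -> bool:
    """True when the received packet (data plus parity) holds an even number of 1s."""
    return sum(packet) % 2 == 0


data = [1, 0, 1, 1, 0, 0, 1, 0]
packet = data + [even_parity_bit(data)]

one_error = packet.copy()
one_error[2] ^= 1            # a single bit reversed in transit

two_errors = one_error.copy()
two_errors[5] ^= 1           # a second bit reversed

print(parity_ok(packet))      # intact packet passes
print(parity_ok(one_error))   # odd number of errors is detected
print(parity_ok(two_errors))  # even number of errors slips through
```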
Another approach to error detection involves the computation of a checksum. In this case, the
packets that constitute a message are added arithmetically. A checksum number is appended to
the packet sequence so that the sum of data plus checksum is zero. When received, the packet
sequence may be added, along with the checksum, by a local microprocessor. If the sum is
nonzero, an error has occurred. As long as the sum is zero, it is highly unlikely (but not
impossible) that any data has been corrupted during transmission.
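The zero-sum checksum can be sketched as follows. The sketch assumes each packet is one byte and that sums are taken modulo 256; the text does not fix a word size, so both are illustrative choices.

```python
def checksum(data: bytes) -> int:
    """Value that makes the sum of data plus checksum equal zero modulo 256."""
    return (-sum(data)) % 256


def verify(received: bytes) -> bool:
    """True when the received bytes, checksum included, sum to zero modulo 256."""
    return sum(received) % 256 == 0


message = b"HELLO"
sent = message + bytes([checksum(message)])

corrupted = bytearray(sent)
corrupted[1] ^= 0x04         # one bit altered in transit

swapped = bytearray(sent)
swapped[0], swapped[1] = swapped[1], swapped[0]  # two bytes exchanged

print(verify(sent))              # intact sequence passes
print(verify(bytes(corrupted)))  # altered byte is detected
print(verify(bytes(swapped)))    # reordering is not detected by a plain sum
```

The last case is one example of the "not impossible" undetected error mentioned above.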
Errors may not only be detected, but also corrected if additional code is added to a packet
sequence. If the error probability is high or if it is not possible to request retransmission, this may
be worth doing. However, including error-correcting code in a transmission lowers channel
efficiency, and results in a noticeable drop in channel throughput.
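One of the simplest error-correcting schemes is a triple-repetition code, shown here only to make the efficiency cost concrete. It is not necessarily the code used by any particular system, and the function names are hypothetical.

```python
def encode(bits: list[int]) -> list[int]:
    """Send every bit three times."""
    return [bit for bit in bits for _ in range(3)]


def decode(coded: list[int]) -> list[int]:
    """Recover each bit by majority vote over its three copies."""
    if len(coded) % 3 != 0:
        raise ValueError("length must be a multiple of three")
    return [1 if sum(coded[i:i + 3]) >= 2 else 0 for i in range(0, len(coded), 3)]


message = [1, 0, 1, 1]
sent = encode(message)

received = sent.copy()
received[4] ^= 1             # one bit reversed by noise

print(decode(received) == message)   # the single error is corrected
print(len(message) / len(sent))      # only one third of the sent bits are data
```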
Data Compression
If a typical message were statistically analyzed, it would be found that certain characters are used
much more frequently than others. By analyzing a message before it is transmitted, short binary
codes may be assigned to frequently used characters and longer codes to rarely used characters.
In doing so, it is possible to reduce the total number of bits sent without altering the
information in the message. Appropriate decoding at the receiver will restore the message to its
original form. This procedure, known as data compression, may result in a 50 percent or greater
savings in the amount of data transmitted. Even though time is necessary to analyze the message
before it is transmitted, the savings may be great enough so that the total time for compression,
transmission, and decompression will still be lower than it would be when sending an
uncompressed message.
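Assigning short codes to frequent characters can be sketched with Huffman coding. This is a minimal version built on Python's standard `heapq` and `collections.Counter`; the function names and the sample message are illustrative, and codes are kept as strings of '0' and '1' for readability rather than packed into bytes.

```python
import heapq
from collections import Counter


def huffman_codes(message: str) -> dict[str, str]:
    """Build a prefix-free code in which frequent characters get shorter codes."""
    freq = Counter(message)
    if not freq:
        raise ValueError("message is empty")
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (count, tie-breaker, {symbol: code so far}).
    heap = [(count, i, {sym: ""}) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        c1, _, left = heapq.heappop(heap)
        c2, _, right = heapq.heappop(heap)
        merged = {sym: "0" + code for sym, code in left.items()}
        merged.update({sym: "1" + code for sym, code in right.items()})
        heapq.heappush(heap, (c1 + c2, tie, merged))
        tie += 1
    return heap[0][2]


def encode(message: str, codes: dict[str, str]) -> str:
    return "".join(codes[ch] for ch in message)


def decode(bits: str, codes: dict[str, str]) -> str:
    lookup = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in lookup:        # prefix-free, so the first match is the symbol
            out.append(lookup[current])
            current = ""
    return "".join(out)


text = "this is an example of a huffman tree"
codes = huffman_codes(text)
bits = encode(text, codes)
print(len(bits), "bits instead of", 8 * len(text))
print(decode(bits, codes) == text)
```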
Some kinds of data will compress much more than others. Data that represents images, for
example, will usually compress significantly, perhaps by as much as 80 percent over its original
size. Data representing a computer program, on the other hand, may be reduced only by 15 or 20
percent.
A compression method called Huffman coding is frequently used in data communications, and
particularly, together with run-length coding, in fax transmission. Clearly, most of the image
data for a typical business letter
represents white paper, and only about 5 percent of the surface represents black ink. It is possible
to send a single code that, for example, represents a consecutive string of 1000 white pixels
rather than a separate code for each white pixel. Consequently, data compression will
significantly reduce the total message length for a faxed business letter. Were the letter made up
of randomly distributed black ink covering 50 percent of the white paper surface, data
compression would hold no advantages.
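Sending one code for a run of identical pixels can be sketched as run-length encoding. The example below is a simplified illustration, not the actual fax coding tables; pixels are assumed to be 0 for white and 1 for black, and the scan lines are made up.

```python
from itertools import groupby


def rle_encode(pixels: list[int]) -> list[tuple[int, int]]:
    """Collapse consecutive identical pixels into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(pixels)]


def rle_decode(runs: list[tuple[int, int]]) -> list[int]:
    """Expand (value, run_length) pairs back into individual pixels."""
    return [value for value, length in runs for _ in range(length)]


letter_row = [0] * 1000 + [1] * 5 + [0] * 995   # mostly white paper
busy_row = [0, 1] * 1000                        # ink alternating pixel by pixel

print(rle_encode(letter_row))       # three runs describe 2000 pixels
print(len(rle_encode(busy_row)))    # one run per pixel: nothing is saved
```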
Data Encryption
Privacy is a great concern in data communications. Faxed business letters can be intercepted at
will through tapped phone lines or intercepted microwave transmissions without the knowledge
of the sender or receiver. To increase the security of this and other data communications,
including digitized telephone conversations, the binary codes representing data may be
scrambled in such a way that unauthorized interception will produce an indecipherable sequence
of characters. Authorized receive stations will be equipped with a decoder that enables the
message to be restored. The process of scrambling, transmitting, and descrambling is known as
encryption.
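The scramble and descramble steps can be illustrated with a toy example that XORs the data with a repeating key. This only demonstrates that the same key restores the message; a repeating XOR key is easy to break and is not how practical encryption hardware works. The function name and key are hypothetical.

```python
from itertools import cycle


def scramble(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the key, repeating the key as needed.

    Applying the same function with the same key restores the original data.
    """
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


message = b"Quarterly figures attached."
key = b"\x5a\xc3\x17"                 # illustrative shared key

sent = scramble(message, key)         # what an eavesdropper would see
restored = scramble(sent, key)        # authorized receiver applies the same key

print(sent != message)
print(restored == message)
```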
Custom integrated circuits have been designed to perform this task and are available at low cost.
In some cases, they will be incorporated into the main circuitry of a data communications device
and function without operator knowledge. In other cases, an external circuit is used so that the
device, and its encrypting/decrypting technique, may be transported easily.
Data is typically grouped into packets that are either 8, 16, or 32 bits long, and passed between
temporary holding units called registers. Data within a register is available in parallel because
each bit exits the register on a separate conductor. To transfer data from one register to another,
the output conductors of one register are switched onto a channel of parallel wires referred to as
a bus. The input conductors of another register, which is also connected to the bus, capture the
information:
Following a data transaction, the content of the source register is reproduced in the destination
register. It is important to note that after any digital data transfer, the source and destination
registers are equal; the source register is not erased when the data is sent.
The transmit and receive switches shown above are electronic and operate in response to
commands from a central control unit. It is possible that two or more destination registers will be
switched on to receive data from a single source. However, only one source may transmit data
onto the bus at any time. If multiple sources were to attempt transmission simultaneously, an
electrical conflict would occur when bits of opposite value are driven onto a single bus
conductor. Such a condition is referred to as a bus contention. Not only will a bus contention
result in the loss of information, but it also may damage the electronic circuitry. As long as all
registers in a system are linked to one central control unit, bus contentions should never occur if
the circuit has been designed properly. Note that the data buses within a typical microprocessor
are fundamentally half-duplex channels.
When the source and destination registers are part of an integrated circuit (within a
microprocessor chip, for example), they are extremely close (thousandths of an inch).
Consequently, the bus signals are at very low power levels, may traverse a distance in very little
time, and are not very susceptible to external noise and distortion. This is the ideal environment
for digital communications. However, it is not yet possible to integrate all the necessary circuitry
for a computer (i.e., CPU, memory, disk control, video and display drivers, etc.) on a single chip.
When data is sent off-chip to another integrated circuit, the bus signals must be amplified and
conductors extended out of the chip through external pins. Amplifiers may be added to the
source register:
Bus signals that exit microprocessor chips and other VLSI circuitry are electrically capable of
traversing about one foot of conductor on a printed circuit board, or less if many devices are
connected to it. Special buffer circuits may be added to boost the bus signals sufficiently for
transmission over several additional feet of conductor length, or for distribution to many other
chips (such as memory chips).
Because of the very high switching rate and relatively low signal strength found on data, address,
and other buses within a computer, direct extension of the buses beyond the confines of the main
circuit board or plug-in boards would pose serious problems. First, long runs of electrical
conductors, either on printed circuit boards or through cables, act like receiving antennas for
electrical noise radiated by motors, switches, and electronic circuits:
Such noise becomes progressively worse as the length increases, and may eventually impose an
unacceptable error rate on the bus signals. Just a single bit error in transferring an instruction
code from memory to a microprocessor chip may cause an invalid instruction to be introduced
into the instruction stream, in turn causing the computer to totally cease operation.
A second problem involves the distortion of electrical signals as they pass through metallic
conductors. Signals that start at the source as clean, rectangular pulses may be received as
rounded pulses with ringing at the rising and falling edges:
These effects are properties of transmission through metallic conductors, and become more
pronounced as the conductor length increases. To compensate for distortion, signal power must
be increased or the transmission rate decreased.
Special amplifier circuits are designed for transmitting direct (unmodulated) digital signals
through cables. For the relatively short distances between components on a printed circuit board
or along a computer backplane, the amplifiers are in simple IC chips that operate from standard
+5 V power. The normal output voltage from the amplifier for logic '1' is slightly higher than the
minimum needed to pass the logic '1' threshold. Correspondingly for logic '0', it is slightly lower.
The difference between the actual output voltage and the threshold value is referred to as the
noise margin, and represents the amount of noise voltage that can be added to the signal without
creating an error:
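The noise margin calculation can be sketched numerically. The voltage levels below are the commonly quoted worst-case figures for standard 5 V TTL logic and are used here as an assumption; actual values depend on the logic family and the device data sheet.

```python
def noise_margins(v_oh: float, v_ih: float, v_ol: float, v_il: float) -> tuple[float, float]:
    """Return (high, low) noise margins in volts.

    v_oh: minimum output voltage for logic '1'
    v_ih: minimum input voltage recognized as logic '1'
    v_ol: maximum output voltage for logic '0'
    v_il: maximum input voltage recognized as logic '0'
    """
    return v_oh - v_ih, v_il - v_ol


# Assumed TTL levels: VOH = 2.4 V, VIH = 2.0 V, VOL = 0.4 V, VIL = 0.8 V.
high_margin, low_margin = noise_margins(v_oh=2.4, v_ih=2.0, v_ol=0.4, v_il=0.8)
print(f"logic '1' noise margin: {high_margin:.1f} V")
print(f"logic '0' noise margin: {low_margin:.1f} V")
```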
Computer peripherals such as a printer or scanner generally include mechanisms that cannot be
situated within the computer itself. Our first thought might be just to extend the computer's
internal buses with a cable of sufficient length to reach the peripheral. Doing so, however, would
expose all bus transactions to external noise and distortion even though only a very small
percentage of these transactions concern the distant peripheral to which the bus is connected.
If a peripheral can be located within 20 feet of the computer, however, relatively simple
electronics may be added to make data transfer through a cable efficient and reliable. To
accomplish this, a bus interface circuit is installed in the computer:
It consists of a holding register for peripheral data, timing and formatting circuitry for external
data transmission, and signal amplifiers to boost the signal sufficiently for transmission through a
cable. When communication with the peripheral is necessary, data is first deposited in the
holding register by the microprocessor. This data will then be reformatted, sent with error-
detecting codes, and transmitted at a relatively slow rate by digital hardware in the bus interface
circuit. In addition, the signal power is greatly boosted before transmission through the cable.
These steps ensure that the data will not be corrupted by noise or distortion during its passage
through the cable. In addition, because only data destined for the peripheral is sent, the party-line
transactions taking place on the computer's buses are not unnecessarily exposed to noise.
Data sent in this manner may be transmitted in byte-serial format if the cable has eight parallel
channels (at least 10 conductors for half-duplex operation), or in bit-serial format if only a single
channel is available.
When relatively long distances are involved in reaching a peripheral device, driver circuits must
be inserted after the bus interface unit to compensate for the electrical effects of long cables:
This is the only change needed if a single peripheral is used. However, if many peripherals are
connected, or if other computer stations are to be linked, a local area network (LAN) is required,
and it becomes necessary to drastically change both the electrical drivers and the protocol to send
messages through the cable. Because multiconductor cable is expensive, bit-serial transmission is
almost always used when the distance exceeds 20 feet.
In either a simple extension cable or a LAN, a balanced electrical system is used for transmitting
digital data through the channel. This type of system involves at least two wires per channel,
neither of which is a ground. Note that a common ground return cannot be shared by multiple
channels in the same cable as would be possible in an unbalanced system.
The basic idea behind a balanced circuit is that a digital signal is sent on two wires
simultaneously, one wire expressing a positive voltage image of the signal and the other a
negative voltage image. When both wires reach the destination, the signals are subtracted by a
differential amplifier, producing a signal swing of twice the value found on either incoming line. If
the cable is exposed to radiated electrical noise, a small voltage of the same polarity is added to
both wires in the cable. When the signals are subtracted by the differential amplifier, the noise
common to both wires cancels and the signal emerges from the cable largely free of that noise:
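The cancellation can be demonstrated with a small numerical sketch. It assumes ideal wires, a 1 V signal amplitude, and noise that couples identically onto both conductors; the function names and noise values are made up for illustration.

```python
def drive(bits: list[int], amplitude: float = 1.0) -> list[tuple[float, float]]:
    """Send each bit as a (positive image, negative image) voltage pair."""
    return [(amplitude, -amplitude) if bit else (-amplitude, amplitude) for bit in bits]


def add_common_noise(pairs: list[tuple[float, float]], noise: list[float]) -> list[tuple[float, float]]:
    """Add the same noise voltage to both wires of each pair."""
    return [(plus + n, minus + n) for (plus, minus), n in zip(pairs, noise)]


def receive_balanced(pairs: list[tuple[float, float]]) -> list[int]:
    """Subtract the two wires; the common noise term drops out."""
    return [1 if (plus - minus) > 0 else 0 for plus, minus in pairs]


def receive_single_ended(pairs: list[tuple[float, float]]) -> list[int]:
    """For comparison: read only the first wire against a 0 V threshold."""
    return [1 if plus > 0 else 0 for plus, _ in pairs]


bits = [1, 0, 1, 0]
noise = [0.5, -0.75, -1.5, 1.5]          # illustrative noise, in volts
line = add_common_noise(drive(bits), noise)

print([plus - minus for plus, minus in line])   # swing of twice the amplitude, noise absent
print(receive_balanced(line) == bits)           # balanced receiver recovers the data
print(receive_single_ended(line) == bits)       # single wire is upset by the larger noise
```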
A great deal of technology has been developed for LAN systems to minimize the amount of
cable required and maximize the throughput. The costs of a LAN have been concentrated in the
electrical interface card that would be installed in PCs or peripherals to drive the cable, and in the
communications software, not in the cable itself (whose cost has been minimized). Thus, the cost
and complexity of a LAN are not particularly affected by the distance between stations.
Data communications through the telephone network can reach any point in the world. The
volume of overseas fax transmissions is increasing constantly, and computer networks that link
thousands of businesses, governments, and universities are pervasive. Transmissions over such
distances are not generally accomplished with a direct-wire digital link, but rather with digitally-
modulated analog carrier signals. This technique makes it possible to use existing analog
telephone voice channels for digital data, although at considerably reduced data rates compared
to a direct digital link.
Transmission of data from your personal computer to a timesharing service over phone lines
requires that data signals be converted to audible tones by a modem. An audio sine wave carrier
is used, and, depending on the baud rate and protocol, will encode data by varying the frequency,
phase, or amplitude of the carrier. The receiver's modem accepts the modulated sine wave and
extracts the digital data from it. Several modulation techniques typically used in encoding digital
data for analog transmission are shown below:
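Varying the amplitude, frequency, or phase of a sine wave carrier can be sketched by generating waveform samples. The carrier frequencies, baud rate, and sample rate below are illustrative values rather than those of any particular modem standard, and the scheme labels "ask", "fsk", and "psk" (amplitude-, frequency-, and phase-shift keying) are names chosen for this example.

```python
import math


def modulate(bits: list[int], scheme: str, mark_hz: float = 1200.0,
             space_hz: float = 2400.0, baud: float = 300.0,
             sample_rate: float = 9600.0) -> list[float]:
    """Return carrier samples encoding the bits by amplitude, frequency, or phase."""
    samples_per_bit = int(sample_rate / baud)
    samples = []
    for index, bit in enumerate(bits):
        for k in range(samples_per_bit):
            t = (index * samples_per_bit + k) / sample_rate
            amplitude, freq, phase = 1.0, mark_hz, 0.0
            if scheme == "ask":
                amplitude = 1.0 if bit else 0.0     # carrier on for '1', off for '0'
            elif scheme == "fsk":
                freq = mark_hz if bit else space_hz  # one tone per bit value
            elif scheme == "psk":
                phase = 0.0 if bit else math.pi      # '0' inverts the carrier
            else:
                raise ValueError("scheme must be 'ask', 'fsk', or 'psk'")
            samples.append(amplitude * math.sin(2 * math.pi * freq * t + phase))
    return samples


ask = modulate([1, 0, 1], "ask")
fsk = modulate([1, 0, 1], "fsk")
psk = modulate([1, 0], "psk")
print(len(ask), len(fsk), len(psk))
```

A receiving modem would reverse the process by measuring the amplitude, frequency, or phase of each bit period.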
Similar techniques may be used in digital storage devices such as hard disk drives to encode data
for storage using an analog medium.