CE162-lecture-notes-part1
Learning outcomes
Demonstrate an understanding of number systems, and
conversion methods between number bases, including fixed and
floating-point binary.
Design digital circuits incorporating higher-level logic elements
such as counters, registers and multiplexers.
Perform forensic byte-level interpretation of data file contents using
standard tools.
Describe and implement a serial data transmission system.
Select appropriate system parameters for digital representation of
analogue signals.
Be aware of methods to perform simple manipulations of digital
image and audio data.
Topics to be covered
Revision of binary number representation and binary arithmetic.
Number conversions between arbitrary bases.
Representation of floating-point binary numbers and applications in
digital systems; accuracy of numerical calculations.
Use of Karnaugh map techniques in logic design, including variable-
entered map logic design.
Sequential logic: Asynchronous and synchronous counters; finite
state machines and their design; sequence detectors.
Asynchronous serial data transmission.
Analogue-to-digital and digital-to-analogue conversion techniques.
Sampling in theory (Nyquist's theorem) and in practice (sample and
hold circuits). Quantization and quantization accuracy.
Frequency spectra and frequency domain representation of
sampled data. Fourier series.
Data compression concepts.
Reading material
Definition:

N = ... + d₂r² + d₁r¹ + d₀r⁰ + d₋₁r⁻¹ + d₋₂r⁻² + ...

where the number system base is r, and the {dᵢ} are the individual digits, which can take values 0, 1, ..., (r-1).

Examples:

258.63₁₀ = 2 × 10² + 5 × 10¹ + 8 × 10⁰ + 6 × 10⁻¹ + 3 × 10⁻²
           (hundreds, tens, units, tenths, hundredths)

i.e. an integer part plus a fractional part.
Exercise: Convert the binary and octal examples to their decimal integer + fraction representation
The process involves repeated integer division by the base, collecting remainders as you proceed. To see how and why this works, consider the (pointless!) 'conversion' of a number from base 10 to base 10, e.g. 1023:

1023 ÷ 10 = 102 remainder 3
 102 ÷ 10 =  10 remainder 2
  10 ÷ 10 =   1 remainder 0
   1 ÷ 10 =   0 remainder 1

Reading the remainders from the last to the first gives 1023 back again.

Conversion to Binary:

e.g. 42₁₀:

42 ÷ 2 = 21 remainder 0
21 ÷ 2 = 10 remainder 1
10 ÷ 2 =  5 remainder 0
 5 ÷ 2 =  2 remainder 1
 2 ÷ 2 =  1 remainder 0
 1 ÷ 2 =  0 remainder 1

Reading the remainders from the last to the first: 42₁₀ = 101010₂.
Exercises:
Convert the following to Binary: 47₁₀, 1023₁₀, 32769₁₀
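As an illustration of the repeated-division method (a sketch added here, not part of the original material; the function name and output format are my own choices), the same process in C is:

#include <stdio.h>

/* Convert a non-negative integer to binary by repeated division by 2.
   The remainders come out least-significant bit first, so they are
   stored in a buffer and printed in reverse. */
static void print_binary(unsigned int n)
{
    char digits[32];
    int count = 0;

    if (n == 0) {
        printf("0\n");
        return;
    }
    while (n > 0) {
        digits[count++] = (char)('0' + (n % 2));  /* remainder = next bit */
        n = n / 2;                                /* integer division by the base */
    }
    while (count > 0)
        putchar(digits[--count]);                 /* most significant bit first */
    putchar('\n');
}

int main(void)
{
    unsigned int examples[] = { 42, 47, 1023, 32769 };
    for (int i = 0; i < 4; i++) {
        printf("%u = ", examples[i]);
        print_binary(examples[i]);
    }
    return 0;
}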
Counting in Binary
(IMPORTANT: You should memorise the bit patterns for the numbers representing 0 to 15. The same applies to the integer powers of two, at least up to 2¹⁶ = 65536.)
Octal is particularly useful as an intermediate for manually converting decimal to binary, as it involves fewer divisions
and there is less chance for error. (Using hexadecimal is more difficult, unless you also know the sixteen times table!)
Example: 12345678₁₀

Hence the octal conversion is 57060516₈. Note that the final division is redundant - stop when the value of the fraction is less than 1, i.e., the numerator (what is left of the quotient) is less than the denominator (the base).

If you now want the number in hexadecimal, write each octal digit as three bits and then rearrange the bits in groups of four:

5   7   0   6   0   5   1   6₈
101 111 000 110 000 101 001 110₂ = 1011 1100 0110 0001 0100 1110₂

From memory, or looking up from the first table on the previous page, this gives BC614E₁₆.
Exercises
Convert to Decimal (the convention in the C programming language is for Octal numbers to start with '0' and hexadecimal numbers with '0x'):
It can also be represented as 4/8 + 1/8 = 1/2 + 1/8 = 2⁻¹ + 2⁻³ = 0.5 + 0.125.
The general rule for converting a fractional number to its binary equivalent can again be seen by doing the redundant process in decimal. We take the original number and multiply it by the base. The integer part that is obtained becomes a digit of the fraction, starting with the most significant. The remaining fractional part is multiplied by the base again, and so on, until we are left with zero, although this does not always happen: some fractions (for example 0.1₁₀ in binary) never terminate.
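A small C sketch of this repeated-multiplication method (an added illustration; the function name, digit limit and test values are my own choices):

#include <stdio.h>

/* Print the binary expansion of a fraction 0 <= x < 1 by repeatedly
   multiplying by 2 and taking the integer part as the next bit.
   Stops after max_bits digits if the fraction does not terminate. */
static void print_fraction_binary(double x, int max_bits)
{
    printf("0.");
    for (int i = 0; i < max_bits && x != 0.0; i++) {
        x *= 2.0;                 /* multiply by the base */
        int bit = (int)x;         /* integer part is the next digit */
        putchar('0' + bit);
        x -= bit;                 /* keep only the fractional part */
    }
    putchar('\n');
}

int main(void)
{
    print_fraction_binary(0.625, 24);  /* terminates: 0.101 */
    print_fraction_binary(0.1, 24);    /* does not terminate */
    return 0;
}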
*It's interesting to note that measurement or unit systems that predate decimalisation generally do not have ten as one of their factors: most Imperial measurement in inches uses divisions of 1/2, 1/4, 1/8 and so on, as already mentioned. Twelve is a favourite: 12 inches to the foot, 12 (old) pence to a shilling, 240 old pence to a pound sterling. Twelve has the advantage of having more integer divisors than ten: two, three (most useful), four and six. If evolution had given us eight, twelve or sixteen fingers, things might have been different.
Binary arithmetic
Multiplication
Repeated addition can be used: to compute A × B, add A to itself B times (or B to itself A times, whichever needs fewer operations).
Either way, this is inefficient, so usually the 'shift and add' principle of conventional long multiplication is adapted for
binary; the implementation is simpler than for non-binary and can be done in physical hardware or 'algorithmically' in
software.
  1010  ×
  0111
  ----
     1010    replicate the multiplicand if there is a '1' in the multiplier, otherwise 0
    10100    shift left one place
   101000    shift left two places
  0000000    (multiplier bit is 0, so this partial product is 0)
  -------
  1000110    check: 0100 0110₂ = 46₁₆ = 4 × 16 + 6 = 70
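The same shift-and-add idea in a short C sketch (an added illustration, not from the original notes):

#include <stdio.h>

/* Shift-and-add multiplication of two unsigned integers: for each '1'
   bit in the multiplier, add the suitably shifted multiplicand. */
static unsigned int shift_and_add(unsigned int a, unsigned int b)
{
    unsigned int product = 0;

    while (b != 0) {
        if (b & 1)            /* current multiplier bit is '1'?       */
            product += a;     /* ...then add the shifted multiplicand */
        a <<= 1;              /* shift multiplicand left one place    */
        b >>= 1;              /* move to the next multiplier bit      */
    }
    return product;
}

int main(void)
{
    printf("%u\n", shift_and_add(10, 7));   /* 1010 x 0111 = 70 */
    return 0;
}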
Division
Division is more complicated, and nearly always done algorithmically. The simple (but inefficient) version uses repeated subtraction of the divisor from the dividend, counting the number of subtractions needed until the remainder becomes zero (or would go negative).
Sign and magnitude

In a fixed-length word, one bit is reserved for the sign. For example, in an 8-bit system, using the MSB for the sign:

0111 1111 = +127
1111 1111 = -127

The most significant bit (MSB) becomes 1 for a -ve number, 0 for +ve.

The first problem here is that there are two ways of representing zero: 1000 0000 and 0000 0000.

Another awkwardness is that simple arithmetic addition of the representations of the positive and negative forms of a number does not generate zero, e.g. for -8 and +8:

  1000 1000
+ 0000 1000
  ---------
  1001 0000   which is not zero.

Two's complement

For an N-bit word length, -A is represented as 2^N - A. Consider a word length of 8 bits:

+62 = 0011 1110
-62 = 2⁸ - 62 = 256 - 62 = 194 = 1100 0010

In practice, the two's complement can be obtained by complementing each bit and adding 1.

Exercise
From this, work out a general rule to find the range of two's complement numbers using N bits. What is the range that can be represented using 8 bits?

Example: both numbers negative

  1111 1101
+ 1111 1111
  ---------
(1)1111 1100
• Disregard the final carry. The sum is -ve.
• Warning: overflow could occur.

But, if we're working in 8-bit 2's complement, 85₁₆ will be interpreted as - what?

Two negative numbers, with overflow:

  1000 1001   89₁₆ = -119 (2's complement is 77₁₆ = 119)
+ 1011 0001   B1₁₆ = -79  (2's complement is 4F₁₆ = 79)
  ---------
(1)0011 1010   3A₁₆ = +58   Wrong! (ignore the carry)

2's complement processing, provided overflow does not occur, correctly computes the addition of all possible combinations of +ve and -ve numbers, so specialised hardware to perform subtraction is not needed in a computer system. An adder and a means to complement binary numbers - both easy operations - are sufficient.
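These rules are easy to check in C (an added sketch, not from the notes): negation by 'complement and add 1', and subtraction done with nothing more than addition.

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint8_t a = 62;
    uint8_t neg_a = (uint8_t)(~a + 1);       /* complement each bit and add 1 */

    printf("-62 as 8-bit 2's complement: 0x%02X (%u unsigned)\n",
           neg_a, neg_a);                    /* 0xC2 = 194 = 1100 0010 */

    /* Subtraction A - B done as A + (-B); the final carry is discarded
       automatically because the result is kept to 8 bits. */
    uint8_t b = 17;
    uint8_t diff = (uint8_t)(a + (uint8_t)(~b + 1));
    printf("62 - 17 = %d\n", (int8_t)diff);  /* 45 */
    return 0;
}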
Signed multiplication: P = A × B

'Shift and add' multiplication cannot be done directly with signed numbers. The algorithm is:
1. Check the sign bits of A and B: if they differ, set a 'flag' bit to indicate the product will be -ve; if they are the same, the product will be +ve and the flag bit is not set.
2. Make A and B both positive.
3. Perform shift-and-add multiplication as before.
4. If the flag bit is set, make the product negative by taking its 2's complement.
Done.

Note that overflow may occur: what is the potential length in bits of the result if you multiply two N-bit binary numbers?

Signed division: Q = A / B

1. Check the sign bits of A and B: if they differ, set a 'flag' bit to indicate the result will be -ve; if they are the same, the result will be +ve and the flag is not set.
2. Make A positive and B negative (2's complement).
3. Initialise the quotient (Q) to 0.
4. Initialise the partial remainder to A.
5. loop:
   1. Subtract B from the partial remainder using 2's complement addition of (-B).
   2. Add 1 to Q.
   3. If the partial remainder is zero or negative, exit the loop; otherwise repeat from loop.
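A C sketch in the spirit of this repeated-subtraction division with a sign flag (an added illustration; note that the loop below stops as soon as the divisor no longer fits, which avoids the overshoot of the 'zero or negative' test described above):

#include <stdio.h>
#include <stdlib.h>

/* Division by repeated subtraction with a sign flag. */
static int divide(int a, int b)
{
    int negative = (a < 0) != (b < 0);   /* flag: signs differ => -ve result */
    unsigned int ua = (unsigned int)abs(a);
    unsigned int ub = (unsigned int)abs(b);
    unsigned int q = 0;

    while (ua >= ub) {      /* subtract the divisor until it no longer fits */
        ua -= ub;
        q++;
    }
    return negative ? -(int)q : (int)q;
}

int main(void)
{
    printf("%d\n", divide(70, 7));    /*  10 */
    printf("%d\n", divide(-45, 6));   /*  -7 */
    return 0;
}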
Signed and Unsigned variables

In the C programming language, integers can be of five basic types which vary in size, depending on the processor, in units of 1 byte (8 bits). This program and its output list the sizes on a PC running Linux.

Sign extension

When a short signed variable is copied into a longer one, we want the sign of the longer version of the variable to be the same as that of the original; this is done by a process called sign extension. Suppose we have the numbers +63 and -120 in 8-bit 2's complement char variables. In binary:

+63  = 0011 1111
-120 = 1000 1000
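The original listing is not reproduced above; a minimal sketch in the same spirit (the type names chosen and the printed text are my own) is:

#include <stdio.h>

int main(void)
{
    /* Sizes of the basic integer types on this machine, in bytes. */
    printf("char: %zu, short: %zu, int: %zu, long: %zu, long long: %zu\n",
           sizeof(char), sizeof(short), sizeof(int),
           sizeof(long), sizeof(long long));

    /* Sign extension: copying an 8-bit signed value into a wider int
       replicates the sign bit into the new high-order bits. */
    signed char a = 63, b = -120;
    int wa = a, wb = b;
    printf("+63  -> 0x%08X (%d)\n", (unsigned)wa, wa);  /* 0x0000003F */
    printf("-120 -> 0x%08X (%d)\n", (unsigned)wb, wb);  /* 0xFFFFFF88 */
    return 0;
}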
Teletype close-up
The units at left are paper tape punch (top) and paper tape reader.
Punched paper tape was sometimes the only backup medium for programs
and data. The TTY printed at 10 characters/second, and could read tapes
at the same speed. 'High speed' (300 characters/second) paper tape
readers and punches were also available, but as separate units.
Note that each character occupies a row of 7 bits plus a parity bit. EVEN
parity is being used here. The large holes are data, a hole representing
binary '1', and no hole '0'. The small holes engage with the teeth of the drive
sprocket.
Error checking - parity

The 7-bit ASCII code sometimes incorporates a simple error checking process which uses the spare bit of each byte. Error checking is needed on 'noisy' data channels (e.g. radio), when bits may randomly change from 1 to 0 or 0 to 1. Using the 8th bit as a parity bit on serial communication ports is an option that can be selected.

Parity can be 'Odd' or 'Even', meaning that the number of 1s in each byte, including the parity bit, is odd or even. If the receiver detects the wrong parity, then an error is likely. For example, consider 4-bit data words:

Data   Odd parity   Even parity
0000   1            0
0001   0            1
0010   0            1
0011   1            0
0100   0            1
0101   1            0
0110   1            0
0111   0            1
1000   0            1
1001   1            0
1010   1            0
1011   0            1
1100   1            0
1101   0            1
1110   0            1
1111   1            0

This system fails, of course, if more than one bit is in error, but if the error probability is moderately low, then it is quite effective.

Assuming the errors are random, and that they occur with probability p, then the probability of a bit being received correctly is (1-p).

In an n-bit word, the probability of exactly one error is:

    n p (1-p)^(n-1)        (1)

The error could occur in n places, while the probability of the remaining (n-1) bits being correct is (1-p)^(n-1).

The probability of exactly 2 errors is:

    ⁿC₂ p² (1-p)^(n-2)     (2)

where ⁿC₂ = n(n-1)/2 is the number of ways of choosing the positions of the 2 errors.

As an example, let p = 10⁻⁴, a very high error rate in practice, with 8-bit data.

From eqn (1), P(exactly 1 error) = 8 × 10⁻⁴ × (0.9999)⁷ = 7.9944 × 10⁻⁴ (≈ 8 × 10⁻⁴).

From eqn (2), P(exactly 2 errors) = ⁸C₂ × 10⁻⁸ × (0.9999)⁶ = 28 × 10⁻⁸ × 0.999400 = 2.798 × 10⁻⁷

This is a huge reduction, so it's likely that the parity system will work most of the time.
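A one-function C sketch of the parity calculation (an added illustration, not from the notes): the even-parity bit is just the XOR of the seven data bits.

#include <stdio.h>

/* Return the even-parity bit for the low 7 bits of c: the bit value that
   makes the total number of 1s (data + parity) even. */
static unsigned int even_parity(unsigned int c)
{
    unsigned int parity = 0;
    for (int i = 0; i < 7; i++)
        parity ^= (c >> i) & 1;   /* XOR of all data bits */
    return parity;
}

int main(void)
{
    unsigned int c = 'A';         /* 100 0001: two 1s, so parity bit = 0 */
    printf("char %c: parity bit %u, byte sent = 0x%02X\n",
           c, even_parity(c), (even_parity(c) << 7) | c);
    return 0;
}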
In the late 1940s Richard Hamming recognised that the further evolution of computers required greater reliability, in particular the ability not only to detect errors, but to correct them. His search for error-correcting codes led to the Hamming codes, perfect 1-error correcting codes, and the extended Hamming codes, 1-error correcting and 2-error detecting codes.

Hamming codes are still widely used in computing, telecommunication, and other applications. Hamming codes are also applied in:
• Data compression
• Some solutions to the popular puzzle 'The Hat Game'
• Block Turbo Codes

Let's consider a sequence of bits (I4, I3, I2, I1). The binary (7,4) Hamming code adds three parity symbols for error detection and correction. The parity bits are placed at the 2⁰ = 1, 2¹ = 2, and 2² = 4 bit positions. It encodes the original sequence of bits to a new sequence (I4, I3, I2, C3, I1, C2, C1), where C1, C2, C3 correspond to the parity bits. These can be computed as:

C3 = I2 + I3 + I4   (modulo-2 addition)
C2 = I1 + I3 + I4
C1 = I1 + I2 + I4

The (7,4) Hamming code is able to correct a single bit error and detect up to two bit errors.

Example

Let us consider the bit sequence (I4, I3, I2, I1) = (1, 1, 0, 1). We compute the parity bits as follows:

C3 = 0 + 1 + 1 = 0
C2 = 1 + 1 + 1 = 1
C1 = 1 + 0 + 1 = 0

Thus, the Hamming-encoded code word is (I4, I3, I2, C3, I1, C2, C1) = (1, 1, 0, 0, 1, 1, 0).

Suppose that the code word is corrupted during storage and the I3 value is switched from 1 to 0. The resulting word is (I4, I3, I2, C3, I1, C2, C1) = (1, 0, 0, 0, 1, 1, 0).

If we calculate the parity symbols again we have:

C3' = 0 + 0 + 1 = 1
C2' = 1 + 0 + 1 = 0
C1' = 1 + 0 + 1 = 0

We can see that (C3', C2', C1') differs from (C3, C2, C1); thus, there is an error, and we can find its position. This is done by performing an XOR (modulo-2 addition) of the old and the new parity bits, i.e. (C3'+C3, C2'+C2, C1'+C1) = (1, 1, 0). Binary 110 is 6, so the error happened in the sixth bit position, which is correct!
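A compact C sketch of this encode-and-check cycle (an added illustration; it keeps the notes' bit ordering, with the code word held as bits (b7...b1) = (I4, I3, I2, C3, I1, C2, C1)):

#include <stdio.h>

/* (7,4) Hamming code, code-word bit order (b7..b1) = (I4,I3,I2,C3,I1,C2,C1). */
static unsigned encode(unsigned i4, unsigned i3, unsigned i2, unsigned i1)
{
    unsigned c3 = i2 ^ i3 ^ i4;          /* parity bits, modulo-2 addition */
    unsigned c2 = i1 ^ i3 ^ i4;
    unsigned c1 = i1 ^ i2 ^ i4;
    return (i4 << 6) | (i3 << 5) | (i2 << 4) | (c3 << 3) |
           (i1 << 2) | (c2 << 1) | c1;
}

/* Recompute the parities of a received word and return the syndrome:
   0 means no error, otherwise it is the position of the bad bit. */
static unsigned syndrome(unsigned w)
{
    unsigned b[8];
    for (int k = 1; k <= 7; k++)
        b[k] = (w >> (k - 1)) & 1;             /* b[1] = C1 ... b[7] = I4 */

    unsigned s1 = b[1] ^ b[3] ^ b[5] ^ b[7];   /* checks C1 = I1+I2+I4 */
    unsigned s2 = b[2] ^ b[3] ^ b[6] ^ b[7];   /* checks C2 = I1+I3+I4 */
    unsigned s3 = b[4] ^ b[5] ^ b[6] ^ b[7];   /* checks C3 = I2+I3+I4 */
    return (s3 << 2) | (s2 << 1) | s1;
}

int main(void)
{
    unsigned w = encode(1, 1, 0, 1);           /* (I4,I3,I2,I1) = (1,1,0,1) */
    printf("code word = ");
    for (int k = 7; k >= 1; k--)               /* print b7..b1 */
        putchar('0' + ((w >> (k - 1)) & 1));   /* 1100110, as in the example */
    putchar('\n');

    unsigned r = w ^ (1u << 5);                /* corrupt bit 6 (I3), as in the text */
    printf("syndrome = %u -> error at bit position %u\n",
           syndrome(r), syndrome(r));
    return 0;
}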
[ASCII code table, two columns of: Binary, Octal, Dec, Hex, Character]
Exercises

1. For completeness, and as an example of simple C language programming, here is the program that generated most of the ASCII code listing on the previous pages. How does it work?

2. By examining the binary for a character and its shifted version (e.g. a and A, 1 and !), work out what functions are performed by the SHIFT key. Do the same for CTRL.

3. What letter keys in combination with CTRL produce CR ('carriage return', from teleprinter usage) and LF (line feed)? Because they are used frequently, on a keyboard these have their own dedicated keys as well, as do HT (horizontal tab) and DELETE.

/*
Program to list printable ASCII characters in binary, octal, decimal and hexadecimal
*/
#include <stdio.h>
#include <stdlib.h>

/*
Array containing character strings representing the binary conversions of 0 - 0xF.
(The printf() procedure doesn't have a 'binary' output format!)
*/
char *binstrings[] = {
    "0000", "0001", "0010", "0011",
    "0100", "0101", "0110", "0111",
    "1000", "1001", "1010", "1011",
    "1100", "1101", "1110", "1111"};

int main()
{
    int cnt;
    unsigned char code;

    code = ' ';
    cnt = 0x7f - ' ';
    printf("\n\nBinary\t\tOctal\tDec\tHex\tCharacter\n");
    while (cnt--){
        printf("%s %s\t%03o\t%3d\t%2x\t%c\n",
               binstrings[(code & 0xf0) >> 4],
               binstrings[code & 0xf],
               code, code, code, code);
        code++;
    }
    exit(0);
}
This slide shows the front (top) and rear (bottom) of a small three-layer
backplane unit for the VME Bus, commonly used in industrial
applications. This version uses 3x64 pin connectors. Thin lines carry
signals, thick ones power and ground.
Closeup of a databus on a PC
backplane
Interfacing to databuses

The standard way of driving a databus uses tristate logic devices (right). As the name implies, these have three output conditions:

Enabled: the output drives the bus to logic high or logic low, following the device's data input.
Disabled: the output is in a high-impedance state, effectively disconnected, so that some other device can drive the bus.
Data Register
The function of a data register is to act as a temporary store for information. It does this by capturing the state (logic high or low) of all its input lines in parallel, and simultaneously, into individual binary storage elements called 'flip-flops'.
Most registers are 'edge triggered' (positive or negative), which means that the capture takes place on a transition
(that is, a change from low-to-high or high-to-low) of the strobe or clock pulse input. The state of the input line the
instant before the transition is the one that matters; at other times the data inputs are ignored.
The example device block diagram (previous page) shows the 74LS374 octal register. This device does not have a 'clear' input (one which forces all outputs to zero when activated), but many registers do.
To get word lengths longer than 8 bits, additional registers are added as required, but driven from the same clock/
enable/clear inputs.
Exercise
Draw the output waveform from O0 if the input D0 and clock waveforms are as shown. The positive edge (arrowed) is the active one.

[Timing diagram: Data input D0; Clock/strobe; Output O0 (to be drawn); horizontal axis is Time]
Receiving RS232 serial data

The RS232 standard defines a number of 'Baud rates', formally defined as 'symbols per second'*, but here a symbol is a bit, so it is the same as bits per second. Some standard Baud rates are:

75
110
300
1200
2400
9600
19200
38400

and so on. Baud rates 75 and 110 were used with mechanical Teletypes and similar devices.

Converting to serial form can be done with a 10-bit parallel-in, serial-out (PISO) shift register, assuming 1 start bit, 1 stop bit and 8 data bits. The bits corresponding to the start and stop bits would be hardwired to logic 0 and 1 respectively. Similarly, receiving serial data uses a serial-in, parallel-out (SIPO) shift register to reassemble each byte.

To receive serial data reliably also requires a clock or strobe waveform, ideally synchronised to the middle of the input bit period. Why is this?

Because the receive clock is reset by each start bit, as long as the transmitter and the receiver agree on the Baud rate and the number of bits per character, its accuracy does not have to be very great: provided the drift is less than, say, 25% of the bit period over 10 bits, performance will be satisfactory. In practice, a very stable crystal oscillator-driven Baud rate generator will be used.

Why 'asynchronous'?

The reason is, as above, that the clocks at transmitter and receiver do not have to be precisely locked in frequency and phase, as would be the case in a synchronous system. In synchronous systems, it is usually not convenient to provide a separate path for data and clock, so the receive clock has to be recovered in some way from the data sequence itself. How this is done cannot be covered in detail here, but it involves knowing the precise clock frequency and that the signal transitions on the incoming data stream happen at integer multiples of the clock period. By ensuring that there is a regular 'supply' of such transitions (i.e., you need to avoid transmitting long strings of 0s like 00000000...!), clock recovery can be done quite easily. The next problem is to detect the 'framing' of the incoming data, i.e., knowing where the start of each byte is.

*Serial digital data transmission often uses more than two levels for signalling. For example, with 4 discrete levels, each symbol carries two bits of information. If the Baud rate is N symbols per second, then the data rate for 4 levels would be 2N bits per second. What is the bit rate for 8-level signalling?
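To make the framing concrete, here is a short C sketch (an added illustration, not from the notes) that builds the 10-bit frame for one character - start bit 0, eight data bits sent least-significant bit first, stop bit 1 - and prints the bits in the order they would appear on the line:

#include <stdio.h>

/* Build the 10-bit asynchronous serial frame for one byte:
   start bit (0), 8 data bits LSB first, stop bit (1).
   Bit 0 of the returned value is transmitted first. */
static unsigned int make_frame(unsigned char data)
{
    return (0u)                    /* start bit, position 0        */
         | ((unsigned)data << 1)   /* data bits in positions 1..8  */
         | (1u << 9);              /* stop bit, position 9         */
}

int main(void)
{
    unsigned int frame = make_frame('A');   /* 'A' = 0100 0001 */
    for (int i = 0; i < 10; i++)            /* print in transmission order */
        putchar('0' + ((frame >> i) & 1));
    putchar('\n');                          /* prints 0100000101 */
    return 0;
}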
The Linux utility application KHexEdit gives the raw data in hex on the left, and an attempted ASCII interpretation on the right. 'Unprintable' characters, such as CR and LF, come out as '.'. (You can also use this to edit files at the binary level; for example, it would be possible to change the character strings in the executable file.)

Here is the dialogue when 'saw.wav', an audio file, is listed with od set to display ASCII characters (when it can). A program to play back an audio file needs to know, for example, what the sampling rate is, how many channels there are, and so on. Examining the 'official' format description would enable anyone to write a program to interpret the data correctly, and for example do some signal processing, then output the modified file using the same format.

Some relevant od options:

-A base    Specify the input address base. Base may be one of d, o, x or n, which specify decimal, octal, hexadecimal addresses or no address, respectively.
-i         Output signed decimal ints. Equivalent to -t dI.
-j skip    Skip skip bytes of the combined input before dumping. The number may be followed by one of b, k or m, which specify the units of the number as blocks (512 bytes), kilobytes and megabytes, respectively.
-N length  Dump at most length bytes of input.
-O         Output octal ints. Equivalent to -t o4.

Here is another listing, this time 8192 bytes into the saw.wav file, and well inside the 'sound' part, now interpreting the bytes as 'signed short integers' (i.e. 16 bits). The data (the sequence runs left to right along each row) seem to be periodic in that the sign changes at regular intervals, but the values still look suspicious. Why? (The leftmost column gives the byte number within the file.)
eseimac1:~ timjdennis$ od -A d -j 8k -N 512 -s GEOPHYSICS/SOUND/saw.wav
0008192 256 256 256 256 0 0 0 0
0008208 -1 -1 -1 -1 -257 -257 -257 -257
0008224 -257 -257 -513 -513 -513 -513 -769 -769
0008240 -769 -769 -1025 -1025 -1025 -1025 -1025 -1025
0008256 -1281 -1281 -1281 -1281 -1281 -1281 -1537 -1537
0008272 -1537 -1537 -1537 -1537 -1793 -1793 -1793 -1793
0008288 -1793 -1793 -1793 -1793 -1793 -1793 -1793 -1793
0008304 -2049 -2049 -2049 -2049 -2049 -2049 -2049 -2049
0008320 -2049 -2049 -2049 -2049 -1793 -1793 -1793 -1793
0008336 -1793 -1793 -1793 -1793 -1793 -1793 -1793 -1793
0008352 -1537 -1537 -1537 -1537 -1537 -1537 -1537 -1537
0008368 -1281 -1281 -1281 -1281 -1281 -1281 -1025 -1025
0008384 -1025 -1025 -769 -769 -769 -769 -513 -513
0008400 -513 -513 -257 -257 -257 -257 -1 -1
0008416 -1 -1 0 0 0 0 256 256
0008432 256 256 512 512 512 512 768 768
0008448 768 768 1024 1024 1024 1024 1280 1280
0008464 1280 1280 1536 1536 1536 1536 1536 1536
0008480 1792 1792 1792 1792 1792 1792 2048 2048
0008496 2048 2048 2048 2048 2304 2304 2304 2304
0008512 2304 2304 2304 2304 2304 2304 2304 2304
0008528 2560 2560 2560 2560 2560 2560 2560 2560
0008544 2560 2560 2304 2304 2304 2304 2304 2304
0008560 2304 2304 2304 2304 2304 2304 2048 2048
0008576 2048 2048 2048 2048 1792 1792 1792 1792
0008592 1792 1792 1536 1536 1536 1536 1280 1280
0008608 1280 1280 1024 1024 1024 1024 768 768
0008624 768 768 512 512 512 512 256 256
0008640 256 256 0 0 -1 -1 -1 -1
0008656 -257 -257 -257 -257 -513 -513 -513 -513
0008672 -769 -769 -1025 -1025 -1025 -1025 -1281 -1281
0008688 -1281 -1281 -1537 -1537 -1537 -1537 -1537 -1537
The answer depends on the actual (hardware) processor that is running the machine's operating system. In this case, used to construct this example, it is an Apple iMac running OS X 10.4, but more crucially, the processor is a Motorola Power PC G5. This uses the 'Big Endian' internal byte order, whereas Intel processors used by Microsoft Windows machines (and later versions of the iMac) use Little Endian, as mentioned previously.

On the PPC, the byte order in a file or memory for a 32-bit integer is the most significant byte first and the least significant byte last. Intel machines are the other way around, i.e., the least significant byte comes first.

Hence, the 32-bit hexadecimal number 01020304₁₆ would appear in the file system of a PPC-based machine as the sequence of four hex bytes:

01 02 03 04

On Intel-based machines (Windows, and Macs using the Intel Core Duo processor) the storage format would be:

04 03 02 01

Clearly, if the byte order of data from a file is wrongly interpreted, it potentially makes a very big difference to the numerical values obtained.

Here is the same part of the file, but interpreted by another listing program that swaps pairs of bytes. Now the values make more sense.

8192  1    8244 -5    8296 -8    8348 -8
8194  1    8246 -5    8298 -8    8350 -8
8196  1    8248 -5    8300 -8    8352 -7
8198  1    8250 -5    8302 -8    8354 -7
8200  0    8252 -5    8304 -9    8356 -7
8202  0    8254 -5    8306 -9    8358 -7
8204  0    8256 -6    8308 -9    8360 -7
8206  0    8258 -6    8310 -9    8362 -7
8208 -1    8260 -6    8312 -9    8364 -7
8210 -1    8262 -6    8314 -9    8366 -7
8212 -1    8264 -6    8316 -9    8368 -6
8214 -1    8266 -6    8318 -9    8370 -6
8216 -2    8268 -7    8320 -9    8372 -6
8218 -2    8270 -7    8322 -9    8374 -6
8220 -2    8272 -7    8324 -9    8376 -6
8222 -2    8274 -7    8326 -9    8378 -6
8224 -2    8276 -7    8328 -8    8380 -5
8226 -2    8278 -7    8330 -8    8382 -5
8228 -3    8280 -8    8332 -8    8384 -5
8230 -3    8282 -8    8334 -8    8386 -5
8232 -3    8284 -8    8336 -8    8388 -4
8234 -3    8286 -8    8338 -8    8390 -4
8236 -4    8288 -8    8340 -8    8392 -4
8238 -4    8290 -8    8342 -8    8394 -4
8240 -4    8292 -8    8344 -8    8396 -3
8242 -4    8294 -8    8346 -8    8398 -3
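A sketch of the kind of byte swap such a listing program has to perform on each 16-bit sample (an added illustration; the function name is my own):

#include <stdio.h>
#include <stdint.h>

/* Swap the two bytes of a 16-bit sample, converting between
   big-endian and little-endian representations. */
static int16_t swap16(int16_t v)
{
    uint16_t u = (uint16_t)v;
    return (int16_t)(((u << 8) | (u >> 8)) & 0xFFFF);
}

int main(void)
{
    /* 0x0001 read with the wrong byte order appears as 0x0100 = 256,
       and 0xFFFE (-2) appears as 0xFEFF = -257, as in the od listing. */
    printf("%d -> %d\n", 256, swap16(256));    /*  256 -> 1  */
    printf("%d -> %d\n", -257, swap16(-257));  /* -257 -> -2 */
    return 0;
}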
Conversion routines

1. ASCII to Binary

Within a processor, numerical data are represented in binary but have to be entered or displayed using the ASCII code. This C program for Unix takes its arguments, converts them from ASCII strings to binary, and then outputs them again in octal, decimal, and hexadecimal. It recognises the standard '0...' and '0x...' prefixes for octal and hexadecimal and assumes decimal otherwise. Conversion stops when a character is read that is NOT a valid digit in the current base.

The central part of the algorithm, in 'pseudocode', is:

value = 0;
loop {
    get_character_from_input;
    if (character is valid in current base){
        value = value * base;
        convert_character_to_digit_value;
        value = value + digit_value;
    }
    else exit_loop;
}

/*
Convert argument list from ASCII to integer.
Accept octal (start with 0, digits 0-7), hex (start with 0x, digits 0-9, A-F)
or decimal (digits 0-9).
*/
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    unsigned char c, *cp;
    int value;
    int base, digit_value, cnt, index;

    cnt = argc - 1;
    index = 1;
    while (cnt--){
        cp = (unsigned char *)*++argv;      /* pointer to argument string */
        if (cp[0] == '0'){
            if (cp[1] == 'x'){
                base = 16;
                cp += 2;
            }
            else base = 8;
        }
        else base = 10;

        value = 0;
        while (1){
            c = *cp++;
            /* stop at the first character that is not a valid digit
               in the current base */
            if (base == 8  && !((c >= '0') && (c <= '7'))) break;
            if (base == 10 && !((c >= '0') && (c <= '9'))) break;
            if (base == 16 && !(((c >= '0') && (c <= '9')) ||
                                ((c >= 'A') && (c <= 'F')))) break;

            if (base == 16){
                if ((c >= 'A') && (c <= 'F')) digit_value = c - 'A' + 10;
                else digit_value = c - '0';
            }
            else digit_value = c - '0';

            value = value * base + digit_value;
        }
        printf("\nArgument %d value: 0%o, %d, 0x%x\n",
               index, value, value, value);
        index++;
    }
    exit(0);
}
Computer word lengths are finite, which means that the range of numerical quantities that can be represented exactly is also finite, and this makes handling very large or very small numbers difficult. It is possible to do arithmetic on words of any length by combining the units provided (e.g., the 'long long' integer type in C), but is the precision this gives always necessary?

We already have a way of dealing with this problem in everyday applications, using floating point, also known as 'scientific' representation, which is to specify a small decimal number (integer part ideally in the range 1 to 9) followed by a multiplier in the form of a positive or negative integer power of ten. For example, the speed of light from a table of physical constants is listed as

299792500 m.s⁻¹,

but because of the measurement error, the last digit is effectively 'noise' and hence adds no useful information; thus, it is best omitted. For most practical purposes, this many digits of precision are unnecessary, and it's common to see the speed of light given as 'approximately 3 × 10⁸ m.s⁻¹'.

A form of this notation often used in scientific programming would give the number representing the speed of light as

2.9979250E8

and the rest mass of the electron (about 9.11 × 10⁻³¹ kg) as 9.11E-31.
Decimal Floating Point arithmetic

Multiplication: multiply the mantissas and add the exponents. Example:

(9 × 10⁵) × (2 × 10⁸) = (9 × 2) × (10⁵ × 10⁸) = 18 × 10¹³ = 1.8 × 10¹⁴

Division: divide the mantissas and subtract the exponents. Example:

(9 × 10⁵) ÷ (2 × 10⁸) = (9/2) × (10⁵/10⁸) = 4.5 × 10⁵ × 10⁻⁸ = 4.5 × 10^(5-8) = 4.5 × 10⁻³

Addition and subtraction are less straightforward because the numbers have to be denormalised so that both have the same exponent before the mantissas can be processed. This is also called 'aligning the decimal point'. E.g. for addition:

(9 × 10⁵) + (2 × 10⁸) = (0.009 × 10⁸) + (2 × 10⁸) = 2.009 × 10⁸
There is no 'natural' way of representing floating point numbers in binary; as a result a number of methods exist. A well-known recommended standard is that defined by ANSI/IEEE 754-1985.

This has two main versions: single precision, which uses 32 bits = 4 bytes, and double precision, which uses 64 bits = 8 bytes.
The bits in the IEEE single precision 32-bit floating point standard representation may be numbered from 0 to 31, left to
right*. The first bit is the sign bit, S, the next eight bits are the exponent bits, 'E', and the final 23 bits are the fraction 'F':
S EEEEEEEE FFFFFFFFFFFFFFFFFFFFFFF
0 1      8 9                    31

If 0 < E < 255, then V = (-1)^S × 2^(E-127) × (1.F), where "1.F" is intended to represent the binary number created by prefixing F with an implicit leading 1 and a binary point.
If E = 0 and F is nonzero, then V = (-1)^S × 2^(-126) × (0.F). These are "unnormalized" values.
If E = 0 and F is zero and S is 1, then V = -0.
If E = 0 and F is zero and S is 0, then V = 0.
If E = 255 and F is zero, then V = (-1)^S × Infinity.
If E = 255 and F is nonzero, then V is 'NaN' (not a number).
*This can be confusing, given that it's more usual to number bits right to left, reflecting their 'weight' within a number.
Examples
0 00000000 00000000000000000000000 = 0
1 00000000 00000000000000000000000 = -0
0 11111111 00000000000000000000000 = Infinity
1 11111111 00000000000000000000000 = -Infinity
Note that for most of the range, the precision of the mantissa is actually 24 bits, even though it is allocated 23 bits in the
standard. This is because the fractional part is always of the form 1.xxxxx..., so the leading '1' becomes redundant and
can be left out, but must be restored when calculations are actually performed.
Floating point arithmetic can be done in software, in which case it is very much slower than integer. More usually,
specialised hardware is provided and the speed disadvantage may be slight or non-existent.
Examples

1. Convert -1.5 to single precision floating point representation.

1.5 = 1 + 2⁻¹ = 1.1₂

Hence the exponent is 0. Excess-127 representation is used in IEEE 754, so we add 127, hence E = 0111 1111.

The mantissa is 1.1₂, but the integer part is 'hidden', so F becomes:

100 0000 0000 0000 0000 0000

The sign bit, S, is 1 since the number is negative, so the full result is:

1 0111 1111 1000 0000 0000 0000 0000 000

2. Convert 0 1000 0000 1001 0010 0001 1111 1011 011 to decimal.

The exponent is 128 - 127 = +1.

The mantissa is:

1.1001 0010 0001 1111 1011 011₂
= 1100 1001 0000 1111 1101 1011₂ × 2⁻²³
= C90FDB₁₆ × 2⁻²³
= 13176795₁₀ × 2⁻²³ = 1.570796371₁₀

The exponent is +1, so the number is finally:

1.570796371 × 2¹ = 3.141592741

Looks a bit like π, but the last 3 digits are wrong! Why?
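As a cross-check (an added sketch, assuming the C float type is the IEEE single-precision format described above), the 32-bit pattern in example 2 is 0x40490FDB, and a few lines of C let the machine do the decoding:

#include <stdio.h>
#include <string.h>
#include <stdint.h>

int main(void)
{
    uint32_t bits = 0x40490FDBu;   /* S=0, E=1000 0000, F=100 1001 ... 011 */
    float f;

    memcpy(&f, &bits, sizeof f);   /* reinterpret the bit pattern as a float */
    printf("%.9f\n", f);           /* prints 3.141592741 */

    /* Pull the fields apart, as in the worked example. */
    printf("S = %u, E = %u (2^%d), F = 0x%06X\n",
           bits >> 31, (bits >> 23) & 0xFF,
           (int)((bits >> 23) & 0xFF) - 127, bits & 0x7FFFFF);
    return 0;
}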
Exercises
Run the program:

#include <stdio.h>

main()
{
    float sum, inc;
    int cnt;
    unsigned long long ltmp = 0;

    printf("sizeof float: %d, double: %d, long long: %d\n",
           sizeof(float), sizeof(double), sizeof(long long));
    ...

Its output is:

eseimac1:~/DEMO_PROGS timjdennis$ rounding_error
sizeof float: 4, double: 8, long long: 8
Calculating 10^6, 'float' variables

Bit pattern, float: 0x3dcccccd00000000

Repeated addition. sum = 1087937.000000000

Multiplication. sum = 1000000.000000000
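The rest of the listing is not shown; a sketch of what it evidently computes (the 0.1 step, the count of ten million and the variable names are assumptions based on the printed output) is:

#include <stdio.h>
#include <string.h>
#include <stdint.h>

int main(void)
{
    float sum, inc;
    int cnt;
    uint32_t bits;

    inc = 0.1f;                        /* 0.1 is not exactly representable in binary */
    memcpy(&bits, &inc, sizeof bits);
    printf("Bit pattern, float: 0x%08x\n", bits);   /* 0x3dcccccd */

    /* Repeated addition: add 0.1f ten million times; the rounding error
       of each addition accumulates, so the result ends up well above 10^6. */
    sum = 0.0f;
    for (cnt = 0; cnt < 10000000; cnt++)
        sum = sum + inc;
    printf("Repeated addition. sum = %.9f\n", sum);

    /* Multiplication: a single rounding step, so the result is 10^6. */
    sum = inc * 10000000.0f;
    printf("Multiplication. sum = %.9f\n", sum);
    return 0;
}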
The error does not show up with double precision floating point variables here, but it would with proportionately smaller/larger values.

General advice, in programs where a variable is needed that changes by a large number of very small steps, is NOT to use code like this:

inc = very_small_number;
loop_control = 0;
stepping_value = 0.0;
while (loop_control < large_number)
{
    stepping_value = stepping_value + inc;
    loop_control = loop_control + 1;
    ...(rest of loop)
}

Instead, compute the value directly from the integer loop counter (stepping_value = loop_control * inc), so that rounding errors do not accumulate.
Single precision numbers should be avoided; they are really only useful for storing raw data, or processed final data.
Use double precision format (which uses 8 bytes/64 bits) within programs.
Summary
In this section we have covered:
• Number systems, and conversions between different bases: binary, octal, decimal
and hexadecimal.
• Binary fractions
• Binary arithmetic
• Use of Two's Complement for negative numbers; two's complement arithmetic.
• Data representation in computer systems: 'text' and 'data' files.
• The ASCII code. Serial transmission of binary data on the RS232 standard.
• Floating point binary number representation and floating point arithmetic.