Data Structure in Mainframe
- Rajesh K Gupta
Bits and Bytes explained
The computer memory consists of a series of switches known as "Bits" in either the
ON or the OFF position. A combination of eight switches is called a "Byte", and one
byte is used for each character on your keyboard.
Data on IBM midrange (AS/400 and iSeries) and IBM-compatible mainframe systems
is encoded using the 8-bit Extended Binary Coded Decimal Interchange Code
(EBCDIC). In contrast, UNIX and PC operating systems use the 7-bit
American Standard Code for Information Interchange (ASCII), normally stored one
character per 8-bit byte. Because of this, alphanumeric and special characters on
IBM AS/400 and mainframe systems have different binary representations from
those on UNIX and PC systems.
The letter A, for instance, is stored in AS/400 and IBM compatible mainframe
computer memory as 1100 0001, and in UNIX and PC computer memory as 0100
0001 (where 1 represents an 'on' bit, and 0 represents an 'off' bit).
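The two bit patterns quoted above can be checked with a short Python sketch. It assumes code page 037 (Python's "cp037" codec) as a stand-in for EBCDIC; a given system may use a different EBCDIC code page, although the letter A has the same value in the common ones. The helper name bit_pattern is purely illustrative.

```python
# Compare the byte stored for the letter "A" under EBCDIC and ASCII.
# Assumption: code page 037 represents "EBCDIC" here; other EBCDIC
# code pages exist and may differ for some special characters.

def bit_pattern(char: str, encoding: str) -> str:
    """Return the single byte for `char` as two groups of four bits."""
    (byte,) = char.encode(encoding)   # fails if the character needs more than one byte
    bits = format(byte, "08b")        # eight binary digits, zero padded
    return f"{bits[:4]} {bits[4:]}"

ebcdic_a = bit_pattern("A", "cp037")
ascii_a = bit_pattern("A", "ascii")

print("EBCDIC:", ebcdic_a)
print("ASCII: ", ascii_a)
```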
It is useful to be able to refer to a byte of data, and know its bit configuration. But
using a string of 1's and 0's is laborious and leads to errors. For this purpose we use
Hexadecimal Representation.
Binary Explained
The following table shows the structure of a three digit numeric field using the
Binary format that is used on an IBM Mainframe System (i.e. the COBOL syntax
would be USAGE IS COMP). The field contains a value of one-hundred-twenty-
three (or 123). Since the binary format stores the number as an actual binary value
the field will only be two (2) bytes in length.
A binary field that is defined as "Unsigned" (e.g. PIC 999) holds an implied positive value. A two (2)
byte unsigned binary field may contain values from 0 to 65,535.
A binary field that is defined as "Signed" (e.g. PIC S999) uses the high-order, leftmost bit as the
sign. A zero (0) is a positive sign and a one (1) is a negative sign. Negative values are held in
two's complement form, so a two (2) byte signed binary field may contain values from -32,768 to +32,767.
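As a rough illustration of the two-byte binary layout, the sketch below encodes a value with Python's struct module. It assumes big-endian byte order and two's complement for negative values, which is how mainframe binary fields are held. The function name to_halfword is hypothetical, and the sketch ignores any further limit a COBOL compiler may impose from the number of digits in the PIC clause.

```python
import struct

def to_halfword(value: int, signed: bool = True) -> bytes:
    """Encode `value` as a two-byte, big-endian binary field.

    Raises struct.error if the value does not fit in two bytes.
    """
    fmt = ">h" if signed else ">H"   # h = signed 16-bit, H = unsigned 16-bit
    return struct.pack(fmt, value)

halfword = to_halfword(123)
print(halfword.hex().upper())            # the two bytes holding 123
print(to_halfword(-123).hex().upper())   # high-order bit is on for a negative value
```

Calling to_halfword(65535, signed=False) fills both bytes with ones, while the same value with signed=True is rejected because it lies outside the signed range.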
Hexadecimal explained
The first four bits of any byte are, broadly speaking, coded to distinguish
between numbers, letters and special characters. It is therefore convenient
to split the byte into two groups of four bits.
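The highest possible value of a four-digit binary number is 1111, which is
equivalent to 15 in base 10. To represent every four-bit value as
a single digit, we therefore have to work in base 16 (hexadecimal), which
uses the digits 0 to 9 followed by the letters A to F for the values 10 to 15.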
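Splitting a byte into two groups of four bits and writing each group as one base-16 digit can be sketched as follows. The function names are illustrative only.

```python
def split_nibbles(byte: int) -> tuple[int, int]:
    """Split a byte into its first (left) and last (right) four bits."""
    return byte >> 4, byte & 0x0F

def to_hex_digits(byte: int) -> str:
    """Write a byte as two base-16 digits, one per group of four bits."""
    digits = "0123456789ABCDEF"      # values 10 to 15 are written A to F
    high, low = split_nibbles(byte)
    return digits[high] + digits[low]

print(to_hex_digits(0b11000001))     # EBCDIC letter A
print(to_hex_digits(0b01000001))     # ASCII letter A
```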
Packed decimal is, in fact, the default numeric field type used by
the arithmetic functions of many programs (e.g. SELCOPY).
A zoned decimal field spends the first four bits of every byte on coding
bits rather than on the digit itself. To save this storage space we "pack"
the data, that is, we leave out the coded bits, thus reducing the storage
space needed by roughly half.
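A minimal sketch of packing is given below. It relies on details not stated above and therefore labeled as assumptions: the sign is kept in the last four bits of the final byte, with X'C' for positive and X'D' for negative (the usual mainframe convention), and a leading zero digit is added when needed to fill whole bytes. The function name pack_decimal is hypothetical.

```python
def pack_decimal(value: int) -> bytes:
    """Pack an integer two digits per byte, with the sign in the last four bits.

    Assumption: X'C' marks a positive value and X'D' a negative one.
    """
    sign = 0xC if value >= 0 else 0xD
    nibbles = [int(d) for d in str(abs(value))] + [sign]
    if len(nibbles) % 2:             # pad on the left to a whole number of bytes
        nibbles.insert(0, 0)
    return bytes(
        (nibbles[i] << 4) | nibbles[i + 1]
        for i in range(0, len(nibbles), 2)
    )

packed = pack_decimal(123)
print(packed.hex().upper())          # three digits and a sign fit in two bytes
```

The three digits of 123 would need three bytes with one digit per byte, but only two bytes once packed.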
Zoned Decimal representation uses a full byte for each numeric digit, but the
junior (right-most) byte is zoned according to the sign of the complete numeric
string, thus differentiating between positive and negative values.
As we saw above, the first four bits (left-most) of a byte are coded to signify a
numeric value: they are set to binary 1111 (X'F') on AS/400
and mainframe systems, or to binary 0011 (X'3') on UNIX and PC systems.
These "first four bits" are known as the Zone portion of a byte, while the last 4
bits (right-most) are known as the Numeric portion of a byte.
Only the junior (right-most) byte of a zoned decimal string may be zoned to
indicate the sign, or to be more precise, it is only the junior byte which is used
for defining the sign. All other zones on other bytes in the string are ignored for
arithmetic purposes.
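The zoning of the junior byte can be sketched for the EBCDIC case as below. The specific sign zones are assumptions not given in the text: X'C' for positive and X'D' for negative, with X'F' left on every other byte. The function names are illustrative.

```python
def zone_decimal(value: int) -> bytes:
    """Encode an integer as EBCDIC zoned decimal, one digit per byte.

    Assumption: the junior byte carries zone X'C' (positive) or X'D' (negative).
    """
    sign_zone = 0xC if value >= 0 else 0xD
    digits = [int(d) for d in str(abs(value))]
    zones = [0xF] * (len(digits) - 1) + [sign_zone]
    return bytes((zone << 4) | digit for zone, digit in zip(zones, digits))

def unzone_decimal(field: bytes) -> int:
    """Decode a zoned field; only the zone of the junior byte affects the sign."""
    magnitude = int("".join(str(byte & 0x0F) for byte in field))
    negative = (field[-1] >> 4) == 0xD
    return -magnitude if negative else magnitude

zoned = zone_decimal(-123)
print(zoned.hex().upper())           # zone of the last byte carries the sign
print(unzone_decimal(zoned))
```

Because unzone_decimal reads only the numeric portion of every byte except the last, the zones of the leading bytes have no effect on the result.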
Zoned Decimal explained