06 Record Types

The document discusses data types, focusing on associative arrays and records in programming languages. It explains how associative arrays, such as Lua's tables and Perl's optimized implementations, allow for efficient data storage and retrieval, while records provide a structured way to group heterogeneous data elements. The document also covers the differences between arrays and records, their implementation, and the use of tuples as a related data type.


276 Chapter 6 Data Types

hashed access to elements. An array can have elements that are created with
simple numeric indices and elements that are created with string hash keys.
In Lua, the table type is the only data structure. A Lua table is an associa-
tive array in which both the keys and the values can be any type. A table can be
used as a traditional array, an associative array, or a record (struct). When used
as a traditional array or an associative array, brackets are used around the keys.
When used as a record, the keys are the field names and references to fields can
use dot notation (record_name.field_name).
The use of Lua’s associative arrays as records is discussed in Section 6.7.
C# and F# support associative arrays through a .NET class.
An associative array is much better than an array if searches of the elements
are required, because the implicit hashing operation used to access elements
is very efficient. Furthermore, associative arrays are ideal when the data to be
stored is paired, as with employee names and their salaries. On the other hand,
if every element of a list must be processed, it is more efficient to use an array.
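The trade-off above can be sketched in Python, whose dict type is an associative array. The names and salary figures are made-up sample data, not taken from the text.

```python
# Hypothetical sample data: employee names paired with their salaries.
salaries = {"Gary": 75000, "Perry": 57000, "Mary": 55750, "Cedric": 47850}

# Searching by key is a single hashed lookup (constant time on average).
perry_salary = salaries["Perry"]

# The same data as a plain list of pairs needs a linear scan to find a name.
pairs = list(salaries.items())
perry_scanned = next(pay for name, pay in pairs if name == "Perry")

# When every element must be processed, a simple sequential pass is enough.
total_payroll = sum(pay for _, pay in pairs)
```

Both lookups return the same salary; the difference is only in how much work is done to find it.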

6.6.2 Implementing Associative Arrays


The implementation of Perl’s associative arrays is optimized for fast lookups,
but it also provides relatively fast reorganization when array growth requires
it. A 32-bit hash value is computed for each entry and is stored with the entry,
although an associative array initially uses only a small part of the hash value.
When an associative array must be expanded beyond its initial size, the hash
function need not be changed; rather, more bits of the hash value are used.
Only half of the entries must be moved when this happens. So, although expan-
sion of an associative array is not free, it is not as costly as might be expected.
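The idea of growing a table by using more bits of a stored hash value can be illustrated with a minimal Python sketch. This is not Perl's actual implementation: the bucket layout, the function names bucket_index and grow, and the tiny hash values are all assumptions chosen for illustration.

```python
def bucket_index(hash_value: int, bits: int) -> int:
    """Select a bucket using only the low `bits` bits of a stored hash value."""
    return hash_value & ((1 << bits) - 1)


def grow(buckets, bits):
    """Double the table by consulting one more bit of each stored hash value."""
    new_buckets = [[] for _ in range(2 << bits)]
    for bucket in buckets:
        for stored_hash, key in bucket:
            # The hash is not recomputed; it was stored with the entry.
            new_buckets[bucket_index(stored_hash, bits + 1)].append((stored_hash, key))
    return new_buckets


# Hypothetical (hash value, key) entries; real hash values would be 32 bits wide.
entries = [(0b0101, "a"), (0b1101, "b"), (0b0010, "c"), (0b1110, "d")]
bits = 3
table = [[] for _ in range(1 << bits)]
for stored_hash, key in entries:
    table[bucket_index(stored_hash, bits)].append((stored_hash, key))

bigger = grow(table, bits)

# An entry changes buckets only if its newly consulted bit is 1.
moved = [key for h, key in entries if bucket_index(h, bits + 1) != bucket_index(h, bits)]
```

In this sample, two of the four entries have the newly consulted bit set, so only those two change buckets, each moving up by exactly the old table size.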
The elements in PHP’s arrays are placed in memory through a hash func-
tion. However, all elements are linked together in the order in which they were
created. The links are used to support iterative access to elements through the
current and next functions.

6.7 Record Types


A record is an aggregate of data elements in which the individual elements
are identified by names and accessed through offsets from the beginning of
the structure.
There is frequently a need in programs to model a collection of data in
which the individual elements are not of the same type or size. For example,
information about a college student might include name, student number,
grade point average, and so forth. A data type for such a collection might use
a character string for the name, an integer for the student number, a floating-
point for the grade point average, and so forth. Records are designed for this
kind of need.
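As a rough illustration, the student example could be written in Python with a data class. The class name, field names, and values are assumptions made for this sketch, and a Python data class models only the named, differently typed fields, not any particular memory layout.

```python
from dataclasses import dataclass


@dataclass
class Student:
    # Three fields of three different types, as in the example in the text.
    name: str
    student_number: int
    gpa: float


# Hypothetical sample values.
student = Student(name="Ada Park", student_number=1024, gpa=3.7)
summary = f"{student.name} (#{student.student_number}): {student.gpa:.2f}"
```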
It may appear that records and heterogeneous arrays are the same, but that
is not the case. The elements of a heterogeneous array are all references to data
objects that reside in scattered locations, often on the heap. The elements of a
record are of potentially different sizes and reside in adjacent memory locations.
Records have been part of all of the most popular programming languages,
except pre-90 versions of Fortran, since the early 1960s, when they were intro-
duced by COBOL. In some languages that support object-oriented program-
ming, data classes serve as records.
In C, C++, and C#, records are supported with the struct data type. In
C++, structures are a minor variation on classes. In C#, structs are also related
to classes, but are also quite different. C# structs are stack-allocated value types,
as opposed to class objects, which are heap-allocated reference types. Structs
in C++ and C# are normally used as encapsulation structures, rather than data
structures. They are further discussed in this capacity in Chapter 11. Structs are
also included in ML and F#.
In Python and Ruby, records can be implemented as hashes (called dictionaries
in Python), which themselves can be elements of arrays.
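A short Python sketch of this approach follows. The field names and values are made up for illustration.

```python
# Each dict plays the role of a record; its keys serve as the field names.
employees = [
    {"name": "Freddie", "hourly_rate": 13.20},
    {"name": "Mary", "hourly_rate": 15.75},
]

# Index into the list first, then select a field by its key.
first_name = employees[0]["name"]

# The records can be processed together like any other list elements.
best_paid = max(employees, key=lambda e: e["hourly_rate"])["name"]
```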
The following sections describe how records are declared or defined,
how references to fields within records are made, and the common record
operations.
The following design issues are specific to records:
• What is the syntactic form of references to fields?
• Are elliptical references allowed?

6.7.1 Definitions of Records


The fundamental difference between a record and an array is that record ele-
ments, or fields, are not referenced by indices. Instead, the fields are named
with identifiers, and references to the fields are made using these identifiers.
Another difference between arrays and records is that records in some lan-
guages are allowed to include unions, which are discussed in Section 6.10.
The COBOL form of a record declaration, which is part of the data
division of a COBOL program, is illustrated in the following example:

01  EMPLOYEE-RECORD.
    02  EMPLOYEE-NAME.
        05  FIRST       PICTURE IS X(20).
        05  MIDDLE      PICTURE IS X(10).
        05  LAST        PICTURE IS X(20).
    02  HOURLY-RATE     PICTURE IS 99V99.

The EMPLOYEE-RECORD record consists of the EMPLOYEE-NAME record and
the HOURLY-RATE field. The numerals 01, 02, and 05 that begin the lines of
the record declaration are level numbers, which indicate by their relative values
the hierarchical structure of the record. Any line that is followed by a line with
a higher-level number is itself a record. The PICTURE clauses show the formats
of the field storage locations, with X(20) specifying 20 alphanumeric characters
and 99V99 specifying four decimal digits with the decimal point in the middle.

Ada uses a different syntax for records; rather than using the level numbers
of COBOL, record structures are indicated in an orthogonal way by simply
nesting record declarations inside record declarations. In Ada, records cannot be
anonymous—they must be named types. Consider the following Ada declaration:

type Employee_Name_Type is record
   First  : String (1..20);
   Middle : String (1..10);
   Last   : String (1..20);
end record;
type Employee_Record_Type is record
   Employee_Name : Employee_Name_Type;
   Hourly_Rate   : Float;
end record;
Employee_Record : Employee_Record_Type;

In Java and C#, records can be defined as data classes, with nested records
defined as nested classes. Data members of such classes serve as the record fields.
As stated previously, Lua’s associative arrays can be conveniently used as
records. For example, consider the following declaration:

employee = {}
employee.name = "Freddie"
employee.hourlyRate = 13.20

The first statement creates an empty table (record) named employee; the two
assignments that follow add two elements (fields) named name and hourlyRate,
both initialized.

6.7.2 References to Record Fields


References to the individual fields of records are syntactically specified by sev-
eral different methods, two of which name the desired field and its enclosing
records. COBOL field references have the form
field_name OF record_name_1 OF . . . OF record_name_n
where the first record named is the smallest or innermost record that contains
the field. The next record name in the sequence is that of the record that con-
tains the previous record, and so forth. For example, the MIDDLE field in the
COBOL record example above can be referenced with

MIDDLE OF EMPLOYEE-NAME OF EMPLOYEE-RECORD

Most of the other languages use dot notation for field references, where
the components of the reference are connected with periods. Names in dot
notation have the opposite order of COBOL references: They use the name
of the largest enclosing record first and the field name last. For example, the
following is a reference to the field Middle in the earlier Ada record example:

Employee_Record.Employee_Name.Middle

C and C++ use this same syntax for referencing the members of their
structures.
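Python uses the same dot notation for attribute access, so the nested employee record can be mirrored with two data classes. This is a sketch only: the class names and the sample values are assumptions, and the field names are adapted from the Ada example.

```python
from dataclasses import dataclass


@dataclass
class EmployeeName:
    first: str
    middle: str
    last: str


@dataclass
class EmployeeRecord:
    employee_name: EmployeeName  # a record nested inside a record
    hourly_rate: float


# Hypothetical sample values.
employee_record = EmployeeRecord(EmployeeName("Freddie", "Lee", "Jones"), 13.20)

# Largest enclosing record first, field name last.
middle = employee_record.employee_name.middle
```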
References to elements in a Lua table can appear in the syntax of record
field references, as seen in the assignment statements in Section 6.7.1. Such
references could also have the form of normal table elements—for example,
employee["name"].
A fully qualified reference to a record field is one in which all intermedi-
ate record names, from the largest enclosing record to the specific field, are
named in the reference. Both the COBOL and the Ada example field refer-
ences above are fully qualified. As an alternative to fully qualified references,
COBOL allows elliptical references to record fields. In an elliptical reference,
the field is named, but any or all of the enclosing record names can be omitted,
as long as the resulting reference is unambiguous in the referencing environ-
ment. For example, FIRST, FIRST OF EMPLOYEE-NAME, and FIRST OF
EMPLOYEE-RECORD are elliptical references to the employee’s first name in the
COBOL record declared above. Although elliptical references are a program-
mer convenience, they require a compiler to have elaborate data structures and
procedures in order to correctly identify the referenced field. They are also
somewhat detrimental to readability.
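To give a feel for the bookkeeping involved, the following Python sketch resolves COBOL-style references against the employee record layout. It is only an illustration of the idea, not how any COBOL compiler works: the nested-dict layout and the helper names all_paths and resolve are assumptions.

```python
# Nested dicts stand in for the record declaration; None marks an elementary field.
LAYOUT = {
    "EMPLOYEE-RECORD": {
        "EMPLOYEE-NAME": {"FIRST": None, "MIDDLE": None, "LAST": None},
        "HOURLY-RATE": None,
    }
}


def all_paths(node, prefix=()):
    """Yield the full path (outermost name first) of every record and field."""
    for name, child in node.items():
        path = prefix + (name,)
        yield path
        if isinstance(child, dict):
            yield from all_paths(child, path)


def resolve(reference):
    """Resolve 'A OF B OF C' to one full path, or raise ValueError."""
    wanted = [part.strip() for part in reference.split(" OF ")]  # innermost first
    matches = []
    for path in all_paths(LAYOUT):
        if path[-1] != wanted[0]:
            continue
        # The given qualifiers must appear, in order, among the enclosing
        # record names; any of the enclosing names may be omitted.
        enclosing = iter(reversed(path[:-1]))
        if all(qualifier in enclosing for qualifier in wanted[1:]):
            matches.append(path)
    if len(matches) != 1:
        raise ValueError(f"{reference!r} matches {len(matches)} fields")
    return matches[0]


full = resolve("FIRST OF EMPLOYEE-NAME OF EMPLOYEE-RECORD")
short = resolve("FIRST")
```

Here the fully qualified and the elliptical reference resolve to the same field. A reference that matches no field, or more than one, raises an error, which corresponds to the requirement that an elliptical reference be unambiguous.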

6.7.3 Evaluation
Records are frequently valuable data types in programming languages. The
design of record types is straightforward, and their use is safe.
Records and arrays are closely related structural forms, and it is therefore
interesting to compare them. Arrays are used when all the data values have the
same type and/or are processed in the same way. This processing is easily done
when there is a systematic way of sequencing through the structure. Such process-
ing is well supported by using dynamic subscripting as the addressing method.
Records are used when the collection of data values is heterogeneous and
the different fields are not processed in the same way. Also, the fields of a record
often need not be processed in a particular order. Field names are like literal, or
constant, subscripts. Because they are static, they provide very efficient access
to the fields. Dynamic subscripts could be used to access record fields, but doing
so would prevent static type checking and would also be slower.
Records and arrays represent thoughtful and efficient methods of fulfilling
two separate but related applications of data structures.

6.7.4 Implementation of Record Types


The fields of records are stored in adjacent memory locations. But because
the sizes of the fields are not necessarily the same, the access method used for
arrays is not used for records. Instead, the offset address, relative to the begin-
ning of the record, is associated with each field. Field accesses are all handled
using these offsets. The compile-time descriptor for a record has the general
form shown in Figure 6.7. Run-time descriptors for records are unnecessary.
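Offset-based field access can be observed from Python through the standard ctypes module, which lays out a Structure the way a C compiler would. The flattened field list below is an assumption made for illustration; the character-field sizes follow the COBOL example, and the exact offset of the float field depends on the platform's alignment rules.

```python
import ctypes


class EmployeeRecord(ctypes.Structure):
    # Hypothetical C-style layout: three fixed-size character fields and a float.
    _fields_ = [
        ("first", ctypes.c_char * 20),
        ("middle", ctypes.c_char * 10),
        ("last", ctypes.c_char * 20),
        ("hourly_rate", ctypes.c_float),
    ]


# Each field descriptor records the field's fixed offset from the record start.
offsets = {name: getattr(EmployeeRecord, name).offset
           for name, _ in EmployeeRecord._fields_}

record = EmployeeRecord(b"Freddie", b"", b"Jones", 13.5)
base = ctypes.addressof(record)

# Field address = record address + field offset; read five raw bytes from there.
last_bytes = ctypes.string_at(base + offsets["last"], 5)
```

The three character fields sit at offsets 0, 20, and 30. The float follows them, possibly after a few padding bytes inserted for alignment.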

Figure 6.7: A compile-time descriptor for a record. The descriptor is headed
Record and holds, for each of fields 1 through n, a Name, a Type, and an
Offset, followed by an Address entry.

6.8 Tuple Types


A tuple is a data type that is similar to a record, except that the elements are
not named.
Python includes an immutable tuple type. If a tuple needs to be changed, it
can be converted to a list (Python's array type) with the list function. After
the change, it can be
converted back to a tuple with the tuple function. One use of tuples is when
an array must be write protected, such as when it is sent as a parameter to an
external function and the user does not want the function to be able to modify
the parameter.
Python’s tuples are closely related to its lists, except that tuples are
immutable. A tuple is created by assigning a tuple literal, as in the following
example:

myTuple = (3, 5.8, 'apple')

Notice that the elements of a tuple need not be of the same type.
The elements of a tuple can be referenced with indexing in brackets, as in
the following:

myTuple[1]

This references the second element of the tuple (5.8), because tuple indexing begins at 0.
Tuples can be catenated with the plus (+) operator. They can be deleted
with the del statement. There are also other operators and functions that
operate on tuples.
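The tuple operations described above can be collected in one short Python sketch. The variable names other than the tuple literal from the text are assumptions made for illustration.

```python
my_tuple = (3, 5.8, 'apple')

second = my_tuple[1]          # indexing starts at 0, so this is 5.8

# Tuples are immutable: convert to a list, change it, then convert back.
as_list = list(my_tuple)
as_list[0] = 4
changed = tuple(as_list)

joined = my_tuple + changed   # catenation builds a new tuple

empty = ()                    # a tuple with no elements
single = (7,)                 # the trailing comma makes a one-element tuple

del my_tuple                  # removes the name my_tuple; other tuples are unaffected
```

Attempting an assignment such as changed[0] = 1 raises a TypeError, which is what makes a tuple suitable for write-protected data.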
ML includes a tuple data type. An ML tuple must have at least two ele-
ments, whereas Python’s tuples can be empty or contain one element. As in
