
Lab 05 - Data Types (Answers) PDF

The document discusses data types and programming language concepts. It provides review questions and explanations about: 1) Descriptors and how they are used for type checking and memory allocation/deallocation. 2) Ordinal, enumeration, and subrange types and their definitions. 3) The advantages of user-defined enumeration types in terms of readability and reliability. 4) The definitions and advantages of different types of arrays in terms of how their subscript ranges and storage allocation are handled.


ITECH5403 Comparative Programming Languages

School of Science, Engineering and Information Technology

Lab 5 – Data Types


Introduction

As this is an 'odd' week, we'll work on some review questions and exercises relating to data types. In next week's lab, we'll
be investigating the Python programming language.

As usual, I'll remind you that the exam for this course will consist of a selection of questions from each week's lab (80%)
and related questions which help to tie different areas of programming languages together (20%). If you can answer
these questions in the lab – you can answer them in the exam! Remember – in these labs you aren't expected to know
every single answer from memory – sometimes it's best to do a little research / reading / re-reading of materials and come
up with a great, and correct, answer rather than just 'taking a stab' at the question and hoping you're right!

Review Questions

1. What is a descriptor? [Q1]

A descriptor is the collection of attributes of a variable, which are stored in memory.

If the attributes are all static, descriptors are required only at compile-time - whereby the descriptors are built by the
compiler, usually as part of the symbol table, and are used during compilation.

For dynamic attributes, part or all of the descriptor must be maintained during execution. In this case, the descriptor is
used by the run-time system.

In all cases, descriptors are used for type checking and building the code for the allocation and deallocation operations.

2. Define ordinal, enumeration, and subrange types. [Q5]

An ordinal type is one in which the range of possible values can be easily associated with the set of positive integers. In
Java, for example, the primitive ordinal types are the integer types, char and boolean - i.e. these are the primitives whose
values can be matched one-to-one with a range of integers - for example:

int already is an integer, so we can use the range 0..Integer.MAX_VALUE (its full range is Integer.MIN_VALUE..Integer.MAX_VALUE),

char can be thought of as an integer in the range 0..65535 (a Java char is an unsigned 16-bit value), and

boolean can be thought of as an integer in the range 0..1 (although Java itself will not convert a boolean to an integer).

An enumeration type is one in which all of the possible values, which are all named constants, are provided (or
enumerated) in the definition. Enumeration types provide a way of defining and grouping collections of named constants,
which are called enumeration constants or enums – for example:

enum days {Mon, Tue, Wed, Thu, Fri, Sat, Sun};

A subrange type is a contiguous subsequence of an ordinal type. For example, 12..14 is a subrange of the integer ordinal
type. Subtypes are not new types - rather, they are new names for possibly restricted, or constrained, versions of existing
types - for example:

CRICOS Provider No. 00103D Insert file name here Page 1 of 8


type Days is (Mon, Tue, Wed, Thu, Fri, Sat, Sun);

subtype Weekday is Days range Mon..Fri;

subtype Weekend is Days range Sat..Sun;

3. What are the advantages of user-defined enumeration types? [Q6]

Enumeration types can provide large advantages in terms of both readability and reliability.

Readability is enhanced very directly, in that named values are easily recognized whereas coded values are not.

Reliability is also enhanced, because in languages like Ada, C#, F# and Java 5+:

1. No arithmetic operations are legal on enumeration types, which prevents you from adding up days of the week
(which makes no sense) etc., and

2. No enumeration variable can be assigned a value outside its defined range - that is, if you define Mon, Tue, …
Sun - then you cannot set a Day enum to FakeDay - because no such enum exists in the set!
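The two reliability points can be sketched in C++, which is not one of the languages listed above; the assumption here is that a C++11 scoped enumeration (enum class) is close enough to illustrate the idea. The names Day and isWeekend are made up for this example. Note that C++ still allows an out-of-range value to be forced in with an explicit cast, so this is an illustration rather than a guarantee.

```cpp
#include <type_traits>

// Hypothetical enumeration mirroring the days example above.
enum class Day { Mon, Tue, Wed, Thu, Fri, Sat, Sun };

// A scoped enum does not convert implicitly to int, so an expression
// such as Day::Mon + Day::Tue is rejected by the compiler.
static_assert(!std::is_convertible<Day, int>::value,
              "Day must not silently become an integer");

// Readable, named values instead of coded integers such as 5 or 6.
bool isWeekend(Day d) {
    return d == Day::Sat || d == Day::Sun;
}
```

Calling isWeekend(Day::Sat) reads naturally, whereas isWeekend(5) does not compile at all.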

4. Define static, fixed stack-dynamic, stack-dynamic, fixed heap-dynamic, and heap-dynamic arrays. What are the
advantages of each? [Q9]

A static array is one in which the subscript ranges are statically bound and storage allocation is static (i.e. performed
before runtime). The advantage of static arrays is efficiency - no dynamic allocation or deallocation is required.

A fixed stack-dynamic array is one in which the subscript ranges are statically bound, but the allocation is done at
declaration elaboration time during execution. The advantage of fixed stack-dynamic arrays over static arrays is space
efficiency. A large array in one subprogram can use the same space as a large array in a different subprogram, as long as
both subprograms are not active at the same time. The same is true if the two arrays are in different blocks that are not
active at the same time.

Stack-dynamic arrays are those in which both the subscript ranges and the storage allocation are dynamically bound at
elaboration time. Once the subscript ranges are bound and the storage is allocated, however, they remain fixed during
the lifetime of the variable. The advantage of stack-dynamic arrays over static and fixed stack-dynamic arrays is
flexibility, because the size of an array need not be known until the array is about to be used.

Fixed heap-dynamic arrays are similar to fixed stack-dynamic arrays in that the subscript ranges and the storage
binding are both fixed after storage is allocated. The differences are that both the subscript ranges and storage binding are
done when the program requests them during execution, and the storage is allocated on the heap, rather than the
stack. The advantage of this is flexibility - the array's size always fits the problem.

A heap-dynamic array is one in which the binding of the subscript ranges and storage allocation is dynamic and can
change any number of times during the array's lifetime. The advantage of this is flexibility - arrays can grow and shrink
during program execution as required.
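As a rough illustration, four of the five categories can be written down in C++. The function name arrayCategories and the sizes used are invented for this sketch. Standard C++ has no direct equivalent of a stack-dynamic array (variable-length arrays are not part of the C++ standard), so that category is left out.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Static array: subscript range and storage are both fixed before run time.
static int staticArray[100];

std::size_t arrayCategories(std::size_t n) {
    // Fixed stack-dynamic: size fixed at compile time, storage allocated
    // on the stack when this declaration is elaborated.
    int fixedStack[100] = {0};

    // Fixed heap-dynamic: size chosen at run time, allocated on the heap,
    // and fixed for the lifetime of the allocation.
    std::unique_ptr<int[]> fixedHeap(new int[n]());

    // Heap-dynamic: std::vector can grow and shrink during execution.
    std::vector<int> heapDynamic(n, 0);
    heapDynamic.push_back(7);   // grows by one element
    heapDynamic.push_back(8);   // grows again
    heapDynamic.pop_back();     // shrinks by one element

    if (n > 0) {
        fixedHeap[0] = fixedStack[0] + staticArray[0];  // both are zero
    }
    return heapDynamic.size();  // n + 1 after the growth and shrink above
}
```

The std::unique_ptr releases the fixed heap-dynamic array automatically when the function returns, which avoids a manual delete[].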

5. In terms of array storage, define row major order and column major order. In your answer explain
how this affects the layout of the array elements in memory. [Q17-Mod]



In row-major arrays, values in consecutive columns are stored next to each other (i.e. the entire row is stored one value
after the next).

In column-major arrays, values in consecutive rows are stored next to each other in memory (i.e. the entire column is
stored one value after the next).

1 2 3

4 5 6

7 8 9

If we store this data in row-major format then the sequential values as stored in memory are: 1, 2, 3, 4, 5, 6, 7, 8, 9.

But if we store it in column-major format then the sequential values as stored in memory are: 1, 4, 7, 2, 5, 8, 3, 6, 9.

This mapping directly affects how we perform our 'pointer arithmetic' to calculate the memory address of a given array
element.
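The two layouts can be reproduced with a short C++ sketch that copies the 3x3 example into a flat buffer in each order. The function names toRowMajor and toColumnMajor and the constants kRows and kCols are invented for this illustration.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t kRows = 3;
constexpr std::size_t kCols = 3;

// Copies the matrix into a flat buffer one whole row after another.
std::array<int, kRows * kCols> toRowMajor(const int (&m)[kRows][kCols]) {
    std::array<int, kRows * kCols> flat{};
    for (std::size_t r = 0; r < kRows; ++r)
        for (std::size_t c = 0; c < kCols; ++c)
            flat[r * kCols + c] = m[r][c];
    return flat;
}

// Copies the matrix into a flat buffer one whole column after another.
std::array<int, kRows * kCols> toColumnMajor(const int (&m)[kRows][kCols]) {
    std::array<int, kRows * kCols> flat{};
    for (std::size_t r = 0; r < kRows; ++r)
        for (std::size_t c = 0; c < kCols; ++c)
            flat[c * kRows + r] = m[r][c];
    return flat;
}
```

The only difference between the two functions is the index expression, which is exactly the mapping that the address calculation has to respect.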

6. What is the primary difference between a tuple and a record? [Q23]

A record is an aggregate of data elements in which the individual elements are identified by names and accessed through
offsets from the beginning of the structure. For example, information about a Student might include a name (string),
studentNumber (int), dateOfBirth (date) etc. as named elements.

A tuple is a data type that is similar to a record, except that the elements are not named (i.e. there are no name,
studentNumber or dateOfBirth names for the data - they are just a group of data values). To access an element of a tuple,
we specify its position (index) rather than its name, because the elements of a tuple are not named.
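The difference shows up clearly in C++, where a struct plays the role of a record and std::tuple the role of a tuple. This is a minimal sketch; the type alias StudentTuple and the two accessor functions are invented for the example, and the date of birth is kept as plain text to keep it short.

```cpp
#include <string>
#include <tuple>

// Record: each element has a name.
struct Student {
    std::string name;
    int studentNumber;
    std::string dateOfBirth;  // stored as text only to keep the sketch short
};

// Tuple: the same three values, identified only by position.
using StudentTuple = std::tuple<std::string, int, std::string>;

// Record access uses the element's name...
int numberFromRecord(const Student& s) { return s.studentNumber; }

// ...tuple access uses the element's index.
int numberFromTuple(const StudentTuple& t) { return std::get<1>(t); }
```

A reader of numberFromTuple has to know that index 1 means the student number, which is the readability cost of unnamed elements.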

7. Explain what a union type is, and briefly discuss the design issues for them [Q31+32-Mod]

A union is a type whose variables may store different types of values at different times during program execution. For
example, consider a table of constants for a compiler (i.e. a list of the found constants defined in a program which are
used throughout the program) - one field of each table entry is used to store the constant - now suppose we had three
constants - an int, a float and a boolean. It would be nice if we could store each of these constants together in a single
item, so regardless of the actual type of the data, it could be accessed in the same way.

That's pretty much what a union allows us to do: we can access an area of storage which (in this example) can store any
of the three types we defined. For example, in C or C++ we may define and use a union like this:

union FlexibleType {
    int   intElement;
    float floatElement;
    bool  boolElement;
};

union FlexibleType test;

test.intElement = 27;        // Legal! Our union type can store an int.
float x = test.floatElement; // Compiles, but is NOT type checked: the int's bits are read as a float (undefined behaviour in C++).



The design issues around union types are mainly concerned with:

- Should type checking be required? Note that any such type checking must be dynamic.
- Should unions be embedded in records?
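Both design issues can be seen in one small C++ sketch: the union is embedded in a record (a struct) alongside a tag, and the tag is what makes a dynamic type check possible. Everything here (Kind, Constant, makeInt, makeFloat, readFloat) is a hypothetical name chosen for the illustration, not part of any library.

```cpp
#include <stdexcept>

// The tag records which union member is currently valid.
enum class Kind { Int, Float, Bool };

// The union is embedded in a record together with its tag.
struct Constant {
    Kind kind;
    union {
        int   intValue;
        float floatValue;
        bool  boolValue;
    };
};

Constant makeInt(int v)     { Constant c; c.kind = Kind::Int;   c.intValue = v;   return c; }
Constant makeFloat(float v) { Constant c; c.kind = Kind::Float; c.floatValue = v; return c; }

// Dynamic type check: refuse to read a member that was not the one stored.
float readFloat(const Constant& c) {
    if (c.kind != Kind::Float)
        throw std::runtime_error("Constant does not hold a float");
    return c.floatValue;
}
```

The check costs a comparison on every read, which is the price of making the union safe to use.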

8. What are the design issues for pointer types, and what are the two common problems with pointers?
[Q35+36]

A pointer type is one in which the variables have a range of values that consist of memory addresses and a special value,
nil. The nil value is not a valid memory address, and is used to indicate that a pointer does not currently reference a
memory cell.

Pointers are designed for two distinct types of uses:

- Firstly, they provide some of the power of indirect addressing, which is frequently used in assembly language
programming.
- Secondly, pointers provide a way to manage dynamic storage - i.e. data which is dynamically allocated on the
heap (heap-dynamic variables).

Unlike arrays and records, pointers are not structured types - a pointer variable holds a single address rather than a
collection of component values (even though a pointer type is defined using a type operator, such as * in C and C++).
Instead, pointers allow us to create dynamic data structures at runtime - for example, binary trees where we have no idea
how big the tree will be until we read some file or data over a network.

Common problems with pointers include (accept any two common pointer problems, such as):

- Pointer arithmetic cannot be used if the data structure we're pointing to is jagged. For example, consider a
pointer to an array of Strings (where any given String may be of any size) – we would not be able to use pointer
arithmetic to directly access the memory address of the String at index 1 because we have no idea how big the
String at index 0 is (UNLESS – we use fixed size strings!). Even if we do know the size of each element in the
structure (for example, an array of ints), we must be absolutely precise in our pointer arithmetic if we are to
correctly access the data being pointed to.

- The notation for pointers can be complex, for example:


o int* foo, bar; // foo is a pointer to an int, but bar is a regular int!
o *foo = 3; // We must dereference the pointer to assign to the int it points at
o cout << "value of foo: " << foo << endl; // Prints mem address not value!
o int **baz; // This is a pointer to a pointer to an int…
o etc.

- Pointers may be null / nil, and attempts to dereference them will cause a program to crash (or raise a
NullPointerException) as this is not a valid memory address.

- Pointers are a significant cause of memory leaks. Allocating a pointer to a block of memory and then allocating it
again to another block of memory leaks the first block of memory, which never got freed! Further, programmers
must remember to delete single items and delete[] array items of heap-dynamic memory allocated through
pointers!

- Pointers may be dangling, that is – at one stage the pointer pointed to some data. Now that data may not be
there, so the pointer points at a memory address which either no longer contains the data we think it does, or
contains other data, which if we modify can cause all sorts of problems!
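Dereferencing a real dangling pointer is undefined behaviour, so it cannot be demonstrated safely. As a hedged alternative, the sketch below uses the standard std::shared_ptr and std::weak_ptr to show the same situation (a pointer outliving its data) in a form where the problem can be detected. The function names stillAlive and danglingIsDetected are invented for this example.

```cpp
#include <memory>

// Returns true if the observer still refers to live data.
bool stillAlive(const std::weak_ptr<int>& observer) {
    return !observer.expired();
}

// Shows a pointer outliving its data without undefined behaviour.
bool danglingIsDetected() {
    std::weak_ptr<int> observer;
    {
        std::shared_ptr<int> owner = std::make_shared<int>(42);
        observer = owner;              // observer refers to the same int
        if (!stillAlive(observer)) return false;
    }                                  // owner destroyed here, the int is freed
    return !stillAlive(observer);      // a raw pointer would now be dangling
}
```

With a raw int* in place of the std::weak_ptr there would be no way to ask whether the data is still there, which is exactly the dangling pointer problem.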



9. Explain what loosely typed and strongly typed languages are, giving two examples of each [Q45-Mod]

A programming language is strongly typed if type errors are always detected. This requires that the types of all operands
can be determined, either at compile time or at run time. The importance of strong typing lies in its ability to detect all
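misuses of variables that result in type errors. A strongly typed language also allows the detection, at run time, of uses of
the incorrect type values in variables that can store values of more than one type. ML and F# are two examples of
strongly typed languages, and Java and C# are nearly strongly typed (explicit casts can still lead to type errors). C and C++
are not strongly typed, because their union types are not type checked.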

A loosely typed language is a programming language that does not require the type of variables to be defined. For
example, early versions of BASIC were loosely typed. In terms of modern languages, Perl is a loosely typed language -
you can declare a variable, but you are not required to state its type. In the example below, the first line
declares the $test variable, which can then hold either an integer or a string.

my $test;
$test = 1; #Test variable is now integer.
$test = "hello"; #Test variable is now a string.

Problem Set

1. What are arguments for and against representing Boolean values as single bits in memory? [PS1]

For:

- Can pack multiple Booleans into one byte of data (storage efficiency)
- Faster read/write of data to file if stored in a more efficient format.

Against:

- Memory cell alignment issues can occur if we do not keep our data byte aligned. This leads to more than a single
read or write being required to write multi-byte data when it is not byte-aligned. These multiple read/writes
decrease program execution speed.
- It can increase the complexity of the code required to read back the data, as we typically do not read single bits
of memory, we read whatever the bus-width of our computer is, for example 32-bits or 64-bits.
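Both sides of the argument are visible in a small C++ sketch that packs eight Boolean flags into one byte. The helper names setFlag and getFlag are invented for this example. The storage saving is real, but every read and write now needs shifting and masking.

```cpp
#include <cstdint>

// Stores flag number `index` (0..7) in a single byte and returns the new byte.
std::uint8_t setFlag(std::uint8_t packed, unsigned index, bool value) {
    const std::uint8_t mask = static_cast<std::uint8_t>(1u << index);
    return value ? static_cast<std::uint8_t>(packed | mask)
                 : static_cast<std::uint8_t>(packed & ~mask);
}

// Reads flag number `index` (0..7) back out of the byte.
bool getFlag(std::uint8_t packed, unsigned index) {
    return ((packed >> index) & 1u) != 0;
}
```

Compare this with a plain bool variable, where reading the value is a single load with no extra arithmetic.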

2. Multidimensional arrays can be stored in row major order, as in C++, or in column major order, as in
Fortran. Develop the access functions for both of these arrangements for two-dimensional arrays.
[PS10-Mod]

For data to be stored in row-major order, the data for rows is stored together. For example, if we had a 3x3 array
containing the values 1 through 9, then it would look like this:

1 2 3 <- row 0

4 5 6 <- row 1

7 8 9 <- row 2

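To access the data at the ith row and jth column (counting both from zero, as in the diagram above) we would use:

location(a[i, j]) = address of a[0,0] + ((((num rows above ith row) * (size of row)) + num elements to the left of the
jth column) * element size)

location(a[i, j]) = address of a[0,0] + (((i * n) + j) * element_size)

where n is the number of elements in a row (i.e. the number of columns). Because the indices start at zero, there are exactly
i rows above row i and j elements to the left of column j. If the indices started at one instead, the formula would use
(i - 1) and (j - 1), measured from a[1,1].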

For data to be stored in column-major order, the data for columns is stored together. For example, if we had a 3x3 array
containing the values 1 through 9, then it would look like this:

Col0 Col1 Col2

| | |

1 4 7

2 5 8

3 6 9

To access the data at the ith column and jth row (counting both from zero) we would use:

location(a[i, j]) = address of a[0,0] + ((((num columns before ith column) * (size of column)) + num elements above
the jth row) * element size)

location(a[i, j]) = address of a[0,0] + (((i * m) + j) * element_size)

where m is the number of elements in a column (i.e. the number of rows). This has the same shape as the row-major
calculation, but this time we are using i for the COLUMNS and j for the ROWS (i.e. columns and rows are swapped), and
the multiplier is the length of a column rather than the length of a row.
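The two access functions can be written as a pair of small C++ functions that return the byte offset from a[0,0]. This sketch assumes zero-based indices and always passes the row index first, then the column index; the function and parameter names are invented for the illustration.

```cpp
#include <cstddef>

// Byte offset of the element at (row, col) from a[0,0] in row-major order.
// numCols is the number of elements in one row.
std::size_t rowMajorOffset(std::size_t row, std::size_t col,
                           std::size_t numCols, std::size_t elementSize) {
    return (row * numCols + col) * elementSize;
}

// Byte offset of the element at (row, col) from a[0,0] in column-major order.
// numRows is the number of elements in one column.
std::size_t columnMajorOffset(std::size_t row, std::size_t col,
                              std::size_t numRows, std::size_t elementSize) {
    return (col * numRows + row) * elementSize;
}
```

For a 3x3 array of 4-byte elements, the element in row 1, column 2 sits at offset (1 * 3 + 2) * 4 = 20 bytes in row-major order and (2 * 3 + 1) * 4 = 28 bytes in column-major order.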

3. Write a short discussion of what was lost and what was gained in Java’s designers’ decision to not
include the pointers of C++. [PS14]

Pointers allow us a great deal of very fine-grained access to data structures in a very fast manner – however,
they are complex and can be difficult to use. Writing our own pointer arithmetic to dive into the middle of a large
amount of data at exactly the right location can be tricky to get right (as I hope answering question 2 has taught
you) – and the consequence of not getting things just right is that our programs can fail badly.

By omitting pointers from the language, Java has undoubtedly gained reliability – because as mentioned,
pointers can be very difficult to do correctly. The lack of pointers simplifies the language, leading to more
correct code…

…however, Java does pass object references (by value), so what we have is, in effect, a pointer to an object.

A good series of points from Stack Overflow:

http://stackoverflow.com/questions/9595636/why-java-doesnt-support-pointers

You have to distinguish between several uses of pointers:

- Memory access via pointer arithmetic - this is fundamentally unsafe. Java has a robust security
  model and disallows pointer arithmetic for this reason. It would be impossible for the JVM to ensure
  that code containing pointer arithmetic is safe without expensive runtime checks. You don't need
  pointer arithmetic unless you are writing extremely low level code (in which case you should
  probably be using assembler or C/C++ instead)
- Array access via pointer offsets - Java does this via indexed array access so you don't need
  pointers. A big advantage of Java's indexed array access is that it detects and disallows out of
  bounds array access, which can be a major source of bugs. This is generally worth paying the price
  of a tiny bit of runtime overhead.
- References to objects - Java has this, it just doesn't call them pointers. Any normal object reference
  works as one of these. When you do String s="Hello"; you get what is effectively a pointer to a
  string object.
- Passing argument by reference, i.e. passing a reference which allows you to change the value of a
  variable in the caller's scope - Java doesn't have this, but it's a pretty rare use case and can easily be
  done in other ways. This is in general equivalent to changing a field in an object scope that both the
  caller and callee can see.
- Manual memory management - you can use pointers to manually control and allocate blocks of
  memory. This is useful for some applications (games, device drivers) but for general purpose OOP
  programming it is simply not worth the effort. Java instead provides very good automatic garbage
  collection which takes care of memory management for you. This is an extremely good thing: for
  many people who had previously been forced to deal with manual memory management in
  Pascal/C/C++ this was one of the biggest advantages of Java when it launched.

So overall Java doesn't have pointers (in the C/C++ sense) because it doesn't need them for general
purpose OOP programming. Furthermore, adding pointers to Java would undermine security and
robustness and make the language more complex.
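The distinction drawn above between object references and true pass-by-reference can be sketched in C++, which has both. In this illustration a pointer passed by value stands in for a Java object reference; the three function names are invented for the example.

```cpp
// Java-like behaviour: the pointer itself is copied, so reassigning the
// parameter is invisible to the caller...
void reassignCopy(int* p, int* other) { p = other; }

// ...but changes made through the pointer are visible to the caller.
void mutateThrough(int* p) { *p = 99; }

// True pass-by-reference, which Java lacks: the caller's own pointer
// variable is made to point somewhere else.
void reassignCaller(int*& p, int* other) { p = other; }
```

Only reassignCaller changes where the caller's variable points; the first two functions behave the way a Java method does when it is handed an object reference.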

4. What are the arguments for and against Java’s implicit heap storage recovery (i.e. garbage
collection), when compared with the explicit heap storage recovery required in C++? Consider
performance critical / real-time systems such as video games in your answer. [PS15-Mod].

Java's implicit heap storage recovery (i.e. garbage collection) automatically recovers memory allocations which are no
longer used. The JVM can tell they are no longer being used because they can no longer be reached through any live
reference (mainstream JVM collectors trace reachability rather than counting references). This means that we cannot leak
memory in the way that commonly happens in C++, where a block that is never freed stays unavailable until the process
exits (although a Java program can still hold on to objects it no longer needs by keeping references to them). And that's a
great feature…

…however, we do not always have quite the amount of control we would like over when the garbage collector runs. We
can suggest that the garbage collector should run, and it may (or may not, if it deems it unnecessary). What we
CANNOT do is tell the garbage collector NOT to run until we say it can (although it may be possible to configure JVM
settings to make more resources available and perhaps decrease the frequency of when it runs).

The problem with this is that garbage collection takes time and processing. If our application is a game or other
performance-critical application, the garbage collector running can stall our program. In a video game this would make
our frame rate stutter. In a safety-critical real-time application, the consequences may be far more serious (for
example, what if a car air-bag monitoring system garbage-collects for 100ms? If that's the exact 100ms the car is in a
collision it could have significant implications and may even lead to a loss of life!)
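For contrast, the C++ side of the comparison can be sketched as follows: storage is recovered at a point the programmer can see in the source, so there is no collector pause to plan around. The type Particle and the function liveInsideScope are hypothetical names used only for this illustration.

```cpp
#include <memory>

// Hypothetical resource that counts how many instances are alive.
struct Particle {
    static int liveCount;
    Particle()  { ++liveCount; }
    ~Particle() { --liveCount; }
};
int Particle::liveCount = 0;

// Returns the live count observed inside the scope. The Particle is
// destroyed, and its heap storage released, when the function returns.
int liveInsideScope() {
    std::unique_ptr<Particle> p(new Particle());
    return Particle::liveCount;
}
```

The cost of this predictability is that the programmer (or a helper such as std::unique_ptr) must get every release right, which is exactly the burden that garbage collection removes.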

5. Make two short lists of applications of matrices, one for those that require jagged matrices and one for
those that require rectangular matrices. Now, argue whether just jagged, just rectangular, or both
should be included in a programming language. [PS18]



Jagged matrices:

- Storing results for test scores from multiple classes of students (each class having different numbers of students)
in a 2D array.
- Storing the votes taken for each political party along with who cast the vote in a 2D array.
- Storing a record for each day of each month (months having varying numbers of days)
- Any valid others.

Rectangular matrices:

- Storing images or numerical matrices,
- Storing multiplication tables or other 'regular' pre-calculated values,
- Any valid others.

Both types of arrays should be included in any modern programming language because of the need to model the real
world and its data, which does not always fit into rectangular arrays. It could be argued that only allowing rectangular
arrays would be faster for look-up than jagged arrays, but the counter argument to this is that the language should
provide both as separate mechanisms, so when indexing into a rectangular array is required then that technique is used,
and when indexing into jagged arrays that (potentially slower) technique is used.

[Any other valid, logical explanation of which types of arrays should be included is acceptable – but there must be
good, solid reasoning behind the choice made]
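As one possible illustration of a language offering both mechanisms, C++ can express a jagged matrix as a vector of vectors and a rectangular matrix as a fixed-size array of arrays. The aliases Jagged and Rectangular, the constants, and the two functions are invented for this sketch.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Jagged: each class (row) may hold a different number of scores.
using Jagged = std::vector<std::vector<int>>;

// Rectangular: every row has the same number of columns.
constexpr std::size_t kRows = 3;
constexpr std::size_t kCols = 3;
using Rectangular = std::array<std::array<int, kCols>, kRows>;

// For a jagged matrix the element count has to be added up row by row.
std::size_t totalElements(const Jagged& rows) {
    std::size_t total = 0;
    for (const auto& row : rows) total += row.size();
    return total;
}

// For a rectangular matrix the element count is known at compile time.
constexpr std::size_t rectangularElements() {
    return kRows * kCols;
}
```

The jagged form stores each row separately, so reaching an element takes an extra indirection, while the rectangular form can use the simple offset calculation from Problem Set question 2.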

6. In what way is static type checking better than dynamic type checking? [PS21]

Static type checking is able to determine type errors / incompatibilities at compile time at which point the programmer
can fix the program code so that it works correctly. Dynamic type checking occurs at runtime – by which point it may be
too late to do anything about a type error (except perhaps raise an exception).

It is better to find bugs / problems in code earlier rather than later (i.e. at compile time), so that the financial cost of fixing
the bug is as low as possible. Fixing a bug such as a type checking bug in a program which has already been released
and distributed may cost hundreds if not thousands of times as much as fixing it at compile time during program
development.
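The timing difference can be sketched in C++, which supports both kinds of check. The class names Shape, Circle and Square and the function isCircle are made up for this example.

```cpp
#include <type_traits>

struct Shape  { virtual ~Shape() {} };
struct Circle : Shape {};
struct Square : Shape {};

// Static check: verified by the compiler. A violation stops the build,
// so the error can never reach a released program.
static_assert(std::is_base_of<Shape, Circle>::value, "Circle must be a Shape");

// Dynamic check: the answer is only known when this call executes.
bool isCircle(const Shape& s) {
    return dynamic_cast<const Circle*>(&s) != nullptr;
}
```

If the static_assert fails, the programmer sees it immediately; if isCircle returns false in an unexpected place, that is only discovered when that code path actually runs.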

Source: http://www.happyjar.com/comic/poster/
