Principles of Programming Languages: UNIT II - Intro To Programming Concepts Lecture 7 - Data Types

The document discusses different data types in programming languages including primitive data types, character string types, user-defined ordinal types like enumerations and sub-ranges, array types, record types, and pointer and reference types. It provides examples of these types in languages like C, C++, Java, and others, describing how they are defined and implemented. The document also covers topics like type binding, type checking, strong typing, and the theory behind different data types.

Uploaded by

veningston
Principles of Programming Languages

UNIT II – Intro to Programming Concepts


Lecture 7 – Data types
28/08/2017 – III CSE – D – 35 abs.
28/08/2017 – III CSE - B

0
Introduction
• What do you mean when you declare?
int n;
– Possible values of n?
– Possible operations on n?
• How about freshman defined below?
typedef struct {
char name[30];
int student_number;
} Student;
Student freshman = {"John", 644101};

1
Data Type
• A data type is a set
– When we declare that a variable has a certain type,
we say that,
(1) the values that the variable can have are elements
of a certain set, and
(2) there is a collection of operations that can be
applied to those values
• Fundamental issues for PL designers:
– How to define a sufficient set of data types?
– What operations are defined and how are they
specified for a data type?

2
Evolution of Data Types
• Earlier PLs tried to include many data types to
support a wide variety of applications, e.g., PL/I
• Wisdom from ALGOL 68:
– A few basic types with a few defining operators for
users to define their own according to their needs
• From user-defined type to abstract data type
– The interface of a type (visible to the user) is separated
from the representation of its values and the implementation
of its operations (hidden from the user)

3
Uses for Types
• Indicate intended use of declared identifiers
– Types can be checked
• Identify and prevent errors
– Compile-time or run-time checking can prevent
meaningless computations, e.g., 3 + TRUE - "Bill"
• Support optimization
– Example: short integers require fewer bits
– Access record component by known offset

4
Overview
• Various Data Types
– Primitive Data Types, Character String Types, User-
Defined Ordinal Types, Array Types, Jagged Arrays,
Associative Arrays (Key-Value pair), Record Types,
Union Types, Pointer and Reference Types
• Type Binding
• Type Checking
• Strong Typing
• Type Equivalence
• Theory and Data Types
5
Primitive Data Types
• Almost all programming languages provide a
set of primitive data types
– Integer, floating-point, Boolean, character
• Some primitive data types are merely
reflections of the hardware
• Others require only a little non-hardware
support for their implementation

6
Character String Types
• Values are sequences of characters
• Design issues:
– Is it a primitive type or just a special kind of array?
– Should the length of strings be static or dynamic?
– What kinds of string operations are allowed?
• Assignment and copying: what if have different lengths?
• Comparison (=, >, etc.)
• Concatenation
• Substring reference: reference to a substring
• Pattern matching

7
String in C++
• std::string is a standard library class, not a primitive
type, but it can be used much like one
#include <iostream>
#include <string>
using namespace std;

int main()
{
string str;
cin>>str;
cout<<"\n STRING : "<<str;
return 0;
}

8
Character String in C
• C
– Not primitive; use char arrays, terminate with a null
character ('\0')
– A library of functions to provide operations instead of
primitive operators
– Problems with C string library:
• Functions in library do not guard against overflowing the
destination, e.g., strcpy(dest, src);
What if src is 50 bytes and dest is 20 bytes?
• Java
– Not a primitive type; strings are supported through the
String and StringBuffer classes

9
Java – String vs. StringBuffer
• Java provides the StringBuffer and
String classes
– String class is used to manipulate
character strings that cannot be changed.
• Objects of type String are read only and immutable.
– The StringBuffer class is used to represent
characters that can be modified.
• Objects of type StringBuffer are mutable.

10
Example

#include <stdio.h>
#include <string.h>
int main()
{
char src[20]="Hello";
char dest[50]="HelloHelloHelloHelloHelloHelloHelloHelloHello";
strcpy(src,dest); /* copies dest INTO src: overflows the 20-byte array src (undefined behavior) */
printf("\n SRC : %s",src);
printf("\n DEST : %s",dest);

return 0;
}

11
What will be the output?

#include<stdio.h>
#include<string.h>
int main()
{
char a[30]="hello";
char b[2];
strcpy(b,a);
printf("\n %s",b);
return 0;
}

12
Character String Length Options
• Static: COBOL, Java’s String class
– Length is set when string is created
• Limited dynamic length: C and C++
– Length varying, up to a fixed maximum
– In C-based languages, a special character is used to
indicate the end of a string's characters, rather than
maintaining the length
• Dynamic (no maximum): SNOBOL4, Perl,
JavaScript
• Ada supports all three string length options
13
Character String Implementation
• Static length: compile-time descriptor (a collection of the
variable's attributes)
• Limited dynamic length: may need a run-time
descriptor for length (but not in C and C++)
• Dynamic length: need run-time descriptor;
allocation/deallocation is the biggest implementation
problem

14
User-Defined Ordinal Types
• Ordinal type: range of possible values can be
easily associated with positive integers
• Two common user-defined ordinal types
– Enumeration: All possible values, which are
named constants, are provided or enumerated in
the definition, e.g., C# example
enum days {mon, tue, wed, thu, fri, sat, sun};
– Sub-range: An ordered contiguous subsequence
of an ordinal type, e.g., 12..18 is a sub-range of
integer type

15
Enumeration Types
• A common representation is to treat the
values of an enumeration as small integers
– May even be exposed to the programmer, as is in
C:
enum coin {penny=1, nickel=5, dime=10, quarter=25};
enum escapes {BELL='\a', BACKSPACE='\b', TAB='\t',
NEWLINE='\n', VTAB='\v', RETURN='\r' };

• If integer nature of representation is exposed,
may allow some or all integer operations:
Pascal: type SUMMER = (April, May, June, July, September);
for C := April to September do writeln(C);
C: int x = penny + nickel + dime;
16
Evaluation of Enumerated Type
• Aid to readability, e.g., no need to code a color as
a number
• Aid to reliability, e.g., compiler can check:
– Operations (do not allow colors to be added)
– No enumeration variable can be assigned a value
outside its defined range
– Ada, C#, and Java 5.0 provide better support for
enumeration than C++ because enumeration type
variables in these languages are not forced into
integer types
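The reliability points above can be illustrated with a minimal Python sketch. It assumes the standard `enum` module; the `Color` type and the helper functions `try_add` and `try_make` are hypothetical names introduced only for this illustration. Because `Enum` members are not integers, adding two colors is rejected, and so is a value outside the defined set.

```python
from enum import Enum

class Color(Enum):
    # Hypothetical enumeration used only for this illustration.
    RED = 1
    GREEN = 2
    BLUE = 3

def try_add(a, b):
    """Return a + b, or the name of the exception that was raised."""
    try:
        return a + b
    except TypeError as exc:
        return type(exc).__name__

def try_make(value):
    """Convert an integer to a Color, or report the exception name."""
    try:
        return Color(value)
    except ValueError as exc:
        return type(exc).__name__

add_result = try_add(Color.RED, Color.GREEN)  # colors cannot be added
bad_value = try_make(7)                       # 7 is not a defined Color
good_value = try_make(2)                      # 2 maps to Color.GREEN
print(add_result, bad_value, good_value)
```

Python's `IntEnum` would behave more like a C enum, since its members are integers and arithmetic on them is allowed.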
17
Enumerated Data type in C/C++
• User defined data type
• It is mainly used to assign names to integral
constants, the names make a program easy to
read and maintain.
• The keyword ‘enum’ is used to declare new
enumeration types in C and C++

18
Examples
Example 1:
#include<stdio.h>
enum week{Mon, Tue, Wed, Thur, Fri, Sat, Sun};
int main()
{
    enum week day;
    day = Wed;
    printf("%d",day); /* prints 2 */
    return 0;
}

Example 2:
#include <stdio.h>
enum State {Working = 1, Failed = 0, Freezed = 0};
int main()
{
    /* two names may share one value; prints 1, 0, 0 */
    printf("%d, %d, %d", Working, Failed, Freezed);
    return 0;
}

Example 3:
#include<stdio.h>
enum year{Jan, Feb, Mar, Apr, May, Jun, Jul,
          Aug, Sep, Oct, Nov, Dec};
int main()
{
    int i;
    for (i=Jan; i<=Dec; i++)
        printf("%d ", i); /* prints 0 1 2 ... 11 */
    return 0;
}

Example 4:
#include <stdio.h>
enum day {sunday, monday, tuesday, wednesday,
          thursday, friday, saturday};
int main()
{
    enum day d = thursday;
    printf("The day number stored in d is %d", d); /* d is 4 */
    return 0;
}

19
Examples
Example 5:
#include <stdio.h>
enum day {sunday = 1, monday, tuesday = 5,
          wednesday, thursday = 10, friday, saturday};
int main()
{
    /* unassigned names continue from the previous value; prints 1 2 5 6 10 11 12 */
    printf("%d %d %d %d %d %d %d", sunday,
           monday, tuesday, wednesday, thursday, friday,
           saturday);
    return 0;
}

Example 6:
#include <stdio.h>
enum State {Working = 1, Freezed = 10, failed};
enum result {Failed, passed};
int main()
{
    /* failed and Failed are different identifiers; prints 1, 11, 10 */
    printf("%d, %d, %d", Working, failed, Freezed);
    return 0;
}

• All enum constants must be unique in their scope.

enum state {working, failed};
enum result {failed, passed};

int main() { return 0; }

error: redeclaration of enumerator 'failed'

20
Drawback of enum
• Numeric values assigned to enumeration type
variables are checked to determine whether
they are in the range of the internal values of
the enumeration type.
enum colors {red = 1, blue = 1000, green = 100000};
• A value assigned to a variable of colors type
will only be checked to determine whether it
is in the range of 1..100000.
• This checking is not effective when the explicitly
assigned values span a wide range: any value from
1 to 100000 is accepted, although only three are meaningful.

21
Subrange Types
• A subrange type is a contiguous subsequence of
an ordinal type.
– For example, 12..14 is a subrange of integer type.
• Subrange types were introduced by Pascal and are
included in Ada.
• Ada's design
– Not a new type, but a new name for a constrained
version of an existing type
Examples:
type Days is (mon, tue, wed, thu, fri, sat, sun);
subtype Weekdays is Days range mon..fri;

subtype Index is Integer range 1..100;

22
Subrange Types
• Usually, we just use the same representation for
the subtype as for the supertype
– Possibly with code inserted (by the compiler) to
restrict assignments to subrange variables, e.g., in Ada:
Day1: Days;
Day2: Weekdays;

Day2 := Day1;

• The assignment is legal unless the value of Day1
is sat or sun.
• The compiler must generate range-checking code
for every assignment to a subrange variable.
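The range check that the compiler inserts can be sketched in Python. This is only an illustration under stated assumptions: the `Subrange` class and its `assign` method are hypothetical, and the bounds 1..100 mirror the Index subtype shown earlier.

```python
class Subrange:
    """Hypothetical subrange variable: an integer restricted to low..high."""

    def __init__(self, low, high, value):
        self.low = low
        self.high = high
        self.assign(value)

    def assign(self, value):
        # This test plays the role of the compiler-generated range check.
        if not (self.low <= value <= self.high):
            raise ValueError(f"{value} is not in {self.low}..{self.high}")
        self.value = value

index = Subrange(1, 100, 12)
index.assign(18)        # inside the range, accepted

try:
    index.assign(101)   # outside the range, rejected at run time
    rejected = False
except ValueError:
    rejected = True

print(index.value, rejected)
```

The variable keeps its last legal value (18) after the failed assignment, which is the behavior a run-time range error gives.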
23
Subrange Types
• Subrange evaluation
– Aid to readability: make it clear to the readers that
variables of subrange can store only certain range
of values
– Reliability: assigning a value to a subrange variable
that is outside specified range is detected as an
error

24
Array Types
• Array: an aggregate of homogeneous data
elements, in which an individual element is
identified by its position in the aggregate
– Array indexing (or subscripting): a mapping from
indices to elements
– Two types in arrays: element type, index type
• Array index types:
– FORTRAN, C, C++: integer only
– Pascal: any ordinal type (integer, Boolean, char)
– Java: integer types only
– Python: integer only; negative indices count from the end
– C, C++, Perl, and Fortran do not specify range
checking, while Java, ML, C# do
25
Categories of Arrays
• Static: subscript ranges are statically bound and
storage allocation is static (before run-time)
– Advantage: efficiency (no dynamic allocation)
– C and C++ arrays that include static modifier
• Fixed stack-dynamic: subscript ranges are
statically bound, but the allocation is done at
declaration time during execution
– Advantage: space efficiency
– C and C++ arrays without static modifier

26
Static Array in C
#include <stdio.h>
void update(int);

int main(void)
{
    update(1);
    update(0);
    return 0;
}

void update(int FLAG)
{
    static int count[10] = {12, 34, 45, 123, 1, 3, 56, 90, 88, 100};
    int i;

    if (FLAG)
    {
        for(i = 0; i < 10; i++)
            count[i] += 5;
        printf("Updated data!\n");
    }
    else
    {
        printf("No need!\n");
    }
}
• The static array retains its last modified values
(a single global copy) each time the function 'update()' is entered
27
Categories of Arrays (cont.)
• Stack-dynamic: subscript ranges and storage
allocation are dynamically bound at
elaboration/run time and remain fixed during
variable lifetime
– Advantage: flexibility (the size of an array need not be
known until the array is to be used), e.g., Ada

Get(List_Len); -- input array size
declare
List: array (1..List_Len) of Integer;
begin
...
end; -- List array deallocated

28
Categories of Arrays (cont.)
• Fixed heap-dynamic: storage binding is dynamic
but fixed after allocation, allocated from heap
– C and C++ through malloc
• Heap-dynamic: binding of subscript ranges and
storage is dynamic and can change
– Advantage: flexibility (arrays can grow or shrink
during execution), e.g., Perl, JavaScript
– C#: through ArrayList; objects created without
element and added later with Add.
ArrayList intList = new ArrayList();
intList.Add(nextOne);
29
Array Initialization and Operations
• Some languages allow initialization at the time of
storage allocation
– C, C++, Java, C# example:
int list[] = {4, 5, 7, 83};
char name [] = "freddie";
– Java initialization of String objects
String[] names = {"Bob", "Jake", "Joe"};
• Ada allows array assignment and catenation
• C-based languages do not provide any array operations
• Fortran provides elemental operations, e.g.,
– + operator between two arrays results in an array of the
sums of the element pairs of the two arrays

30
Rectangular Arrays
• A rectangular array is a multi-dimensioned array in
which all of the rows have the same number of
elements and all of the columns have the same
number of elements.
• Rectangular arrays model rectangular tables exactly.
• Multi-dimensional:
– 1D
– 2D
– 3D
– 4D
– …

31
Multi-dimensional Arrays - Example
• 1-dimensional array.
• 3-dimensional array (3rd dimension is the day).
[Figure: array diagrams; the 3-dimensional example has one layer per day: Oct 14, Oct 15, Oct 16]
Jagged Arrays
• A jagged array is one in which the lengths of
the rows need not be the same.
• For example, a jagged matrix may consist of
three rows,
– one with 5 elements,
– one with 7 elements, and
– one with 12 elements.

33
Jagged Array

34
Compile-time descriptor for single-dimensioned
arrays

35
A compile-time descriptor for a
multidimensional array
• If the array is a matrix, it is stored either by
rows or by columns. For example, if the matrix had
the values
3 4 7
6 2 5
1 3 8
• In row major order,
– 3, 4, 7, 6, 2, 5, 1, 3, 8
– Row-major order is used in C, C++,
Java, Scala, Pascal, C#
• In column major order,
– 3, 6, 1, 4, 2, 3, 7, 5, 8
– Fortran and MATLAB use column-major
order
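A short Python sketch of the two storage orders, using the 3x3 matrix whose row-major listing is 3, 4, 7, 6, 2, 5, 1, 3, 8. The helper names `row_major_offset` and `col_major_offset` are made up for this illustration; they compute the position of element (i, j) in the flattened storage, assuming 0-based indices.

```python
matrix = [[3, 4, 7],
          [6, 2, 5],
          [1, 3, 8]]
rows, cols = 3, 3

# Flatten the matrix in both storage orders.
row_major = [matrix[i][j] for i in range(rows) for j in range(cols)]
col_major = [matrix[i][j] for j in range(cols) for i in range(rows)]

def row_major_offset(i, j):
    """Position of element (i, j) when rows are stored one after another."""
    return i * cols + j

def col_major_offset(i, j):
    """Position of element (i, j) when columns are stored one after another."""
    return j * rows + i

print(row_major)
print(col_major)
# Both mappings must locate the same element, here matrix[1][2].
print(row_major[row_major_offset(1, 2)], col_major[col_major_offset(1, 2)])
```

The offset formulas are what an array descriptor makes possible: given the bounds, the address of any element follows from its indices.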
36
Associative Arrays
• An unordered collection of data elements
indexed by an equal number of values (keys)
– User defined keys must be stored
• In the case of non-associative arrays,
– The indices never need to be stored (because of
their regularity).
• In an associative array,
– The user-defined keys must be stored in the
structure.

37
Examples

38
Associative Arrays - Example
• Perl:
%hi_temps = ("Mon" => 77, "Tue" => 79,
"Wed" => 65, …);
• % begins the name of a hash (key-value pair) variable
$hi_temps{"Wed"} = 83;
• $ begins the name of a scalar variable
delete $hi_temps{"Tue"};
• Elements can be removed with delete

39
Associative Arrays - Example
• Python:
– Dictionary: keys are unique within a dictionary
while values may not be
dict = {'Name': 'Zara', 'Age': 7, 'Class': 'First'}
– Dictionary Operations:
• len(dict)
• cmp(dict1, dict2) (Python 2 only; removed in Python 3)
• ….
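As a small illustration, the Perl operations from the previous slide (store, update, delete) look like this with a Python dictionary. Only the three key-value pairs actually shown in the Perl example are used.

```python
# Create the associative array with user-defined string keys.
hi_temps = {"Mon": 77, "Tue": 79, "Wed": 65}

hi_temps["Wed"] = 83      # update the value stored under an existing key
del hi_temps["Tue"]       # remove an element, like Perl's delete

count = len(hi_temps)     # number of key-value pairs left
has_tue = "Tue" in hi_temps

print(hi_temps, count, has_tue)
```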

40
Implementing Associative Arrays
• In Perl, a 32-bit hash value is computed for each entry
and is stored with the entry, although an
associative array initially uses only a small part
of the hash value.
• When an associative array must be expanded
beyond its initial size, the hash function need
not be changed; more bits of the stored hash
value are used instead

41
Hash function

42
Record Types
• A record is a possibly heterogeneous aggregate of
data elements in which
– the individual elements are identified by names, and
– accessed through offsets from the beginning of the
structure.
• Introduced by COBOL, it uses level numbers to
show nested records - COBOL Record declaration

01 EMPLOYEE-RECORD.
02 EMPLOYEE-NAME.
05 FIRST PICTURE IS X(20).
05 MID PICTURE IS X(10).
05 LAST PICTURE IS X(20).
02 HOURLY-RATE PICTURE IS 99V99.
43
• Ada: record declaration
– In Ada, records cannot be anonymous; they
must be named types.
type Employee_Name_Type is record
   First : String (1..20);
   Middle : String (1..10);
   Last : String (1..20);
end record;

type Employee_Record_Type is record
   Employee_Name: Employee_Name_Type;
   Hourly_Rate: Float;
end record;

Employee_Record: Employee_Record_Type;
44
Record Types
• In C, C++, and C#,
– records are supported with the struct data type.
• In C++ and C#,
– structures are a minor variation on classes
(Java has no struct type and uses classes instead).
• Examples:
struct Date {
    int day;
    int month;
    int year;
};
struct BankAccount {
    char Name[15];
    int AccountNo[10];
    double balance;
    struct Date Birthday; /* Date must be declared before it is used */
};
struct StudentRecord {
    char Name[15];
    int Id;
    char Dept[5];
    char Gender;
};
45
Record Types
• In Python and Ruby,
– Records can be implemented as hashes, which
themselves can be elements of arrays.

46
Difference b/t Record & Array
• Record elements, or fields, are not referenced
by indices.
• The fields are named with identifiers, and
references to the fields are made using these
identifiers.

47
References to Record Type and
Operations
1. COBOL
– field_name OF record_name_1 OF ... OF record_name_n
2. Others (dot operator)
struct motor
{
float volts; //voltage of the motor
float amps; //amperage of the motor
int phases; //# of phases of the motor
float rpm; //rotational speed of motor
}; //struct motor
struct motor p;
p.volts — is the voltage
p.amps — is the amperage
p.phases — is the number of phases
p.rpm — is the rotational speed
• Operations:
– Ada allows record comparison
– Assignment is common if types are identical
– COBOL provides MOVE CORRESPONDING
• Copies a field of the source record to the corresponding field in the target record
48
A compile-time
descriptor for a record
• The fields of records are
stored in adjacent memory
locations.
• The sizes of the fields are not
necessarily the same,
– The access method used for
arrays is not used for records.
– Instead, the offset address,
relative to the beginning of the
record, is associated with each
field.
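Field offsets can be inspected from Python with the standard `ctypes` module, which lays out a `Structure` the way a C compiler would on the same platform. The sketch below reuses the fields of the struct motor example; the class name `Motor` and the sample values are assumptions made for illustration. The actual offsets depend on the platform's sizes and alignment rules.

```python
import ctypes

class Motor(ctypes.Structure):
    # Same fields, in the same order, as the struct motor example.
    _fields_ = [("volts", ctypes.c_float),
                ("amps", ctypes.c_float),
                ("phases", ctypes.c_int),
                ("rpm", ctypes.c_float)]

# Each field descriptor records its offset from the start of the record.
offsets = {name: getattr(Motor, name).offset for name, _ in Motor._fields_}
print(offsets)

# Access a field the way compiled code does: base address plus offset.
p = Motor(volts=230.0, amps=1.5, phases=3, rpm=1450.0)
base = ctypes.addressof(p)
phases_via_offset = ctypes.c_int.from_address(base + offsets["phases"]).value
print(phases_via_offset)
```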
49
List type in Python
• List Types
list1 = ['physics', 'chemistry', 1997, 2000]
list2 = [1, 2, 3, 4, 5]
list3 = ["a", "b", "c", "d"]
print("list1[0]: ", list1[0])
print("list2[1:5]: ", list2[1:5])
– Operations:
• Length
• Concatenation
• Repetition
• Membership
• Iteration
• Compare two lists
• Maximum in a list
• Minimum in a list
• Convert list to tuple
50
Tuple Types in Python
• Tuple Type
– A tuple is an immutable sequence of Python
objects.
– Tuples are sequences, just like lists.
– The differences between tuples and lists are that
tuples cannot be changed, unlike lists, and tuples
use parentheses ( ), whereas lists use square
brackets [ ]

51
• Tuple Type
tup1 = ('physics', 'chemistry', 1997, 2000)
tup2 = (1, 2, 3, 4, 5)
tup3 = "a", "b", "c", "d"
print("tup1[0]: ", tup1[0])
print("tup2[1:5]: ", tup2[1:5])

– Operations
• Length
• Concatenation
• Repetition
• Membership
• Iteration
• Compare two tuples
• Maximum in a tuple
• Minimum in a tuple
• Convert a list to a tuple
52
Union Types
• A union is a type whose variables are allowed to
store different types of values at different times
union time
{
long simpleDate;
double preciseDate;
} mytime;
...
printTime(mytime.preciseDate);
• Design issues:
– Should type checking be required?
– Should unions be embedded in records?
53
• Free unions
– Programmers are allowed complete freedom from
type checking in their use, e.g., in C
• Discriminated unions
– Type checking of unions requires that each union
construct include a type indicator. Such an indicator is
called a tag, or discriminant, and a union with a
discriminant is called a discriminated union.
• The first language to provide discriminated unions was
ALGOL 68.
• They are also supported by Ada, ML, Haskell, and F#.
54
• Free Union - Example
union flexType
{
    int intEl;
    float floatEl;
};
union flexType el1;
float x;
...
el1.intEl = 27;
x = el1.floatEl;
• This last assignment is not type checked, because the
system cannot determine the current type of the current
value of el1,
• so it assigns the bit string representation of 27 to the
float variable x, which of course is nonsense.
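The effect of reading the bits of 27 as a float can be reproduced in Python with the standard `struct` module. This is a sketch of the reinterpretation only, assuming a 32-bit int and a 32-bit IEEE 754 float (the usual sizes, fixed here by the `<i` and `<f` format codes).

```python
import struct

# Store the integer 27 in four bytes, as el1.intEl = 27 would.
int_bits = struct.pack("<i", 27)

# Read the same four bytes as a float, as x = el1.floatEl would.
x = struct.unpack("<f", int_bits)[0]

print(int_bits.hex(), x)
```

The result is a tiny positive number, nowhere near 27.0, because the integer bit pattern is interpreted under the floating-point format without any conversion.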
55
Discriminated vs. Free Unions
• In Fortran, C, and C++, no language-supported
type checking for union, called free union
• Most common way of remembering what is in a
union is to embed it in a structure
struct var_type
{
int type_in_union;
union
{
float un_float;
int un_int;
} vt_un;
} var_type;
56
Discriminated vs. Free Unions
• Discriminated union: to allow type checking of
unions, each union includes a type indicator
called a discriminant
– Supported by Ada
– Can be changed only by assigning entire record,
including discriminant → no inconsistent records
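A minimal Python sketch of the idea, assuming a hypothetical `Tagged` class that holds either an int or a float. The tag and the value can only be set together, and every read names the type it expects, so a mismatched access is caught instead of silently reinterpreting bits.

```python
class Tagged:
    """Hypothetical discriminated union holding an int or a float."""

    def __init__(self, tag, value):
        self.set(tag, value)

    def set(self, tag, value):
        # Tag and value always change together, so they cannot disagree.
        if tag not in ("int", "float"):
            raise ValueError(f"unknown tag {tag!r}")
        self._tag = tag
        self._value = value

    def get(self, expected_tag):
        # The type check that a free union cannot perform.
        if expected_tag != self._tag:
            raise TypeError(f"holds {self._tag}, not {expected_tag}")
        return self._value

el1 = Tagged("int", 27)
as_int = el1.get("int")

try:
    el1.get("float")        # wrong variant, detected at run time
    caught = False
except TypeError:
    caught = True

print(as_int, caught)
```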

57
Ada Union Types
type Shape is (Circle, Triangle, Rectangle);
type Colors is (Red, Green, Blue);
type Figure (Form: Shape) is record
Filled: Boolean;
Color: Colors;
case Form is
when Circle => Diameter: Float;
when Triangle =>
Leftside, Rightside: Integer;
Angle: Float;
when Rectangle => Side1, Side2: Integer;
end case;
end record;
Figure_1 := (Filled => True,
Color => Blue, Form => Rectangle,
Side1 => 2, Side2 => 3);
58
Ada Union Types
• A discriminated union of three shape variables

59
Ada: A compile-time descriptor for a
discriminated union

60
Evaluation of Unions
• C’s Free unions are unsafe
– Do not allow type checking
• Java and C# do not support unions
– Reflective of growing concerns for safety in
programming language
• Ada’s discriminated unions are safe

61
Pointer and Reference Types
• A pointer type variable has a range of values
that consists of memory addresses and a
special value, NIL
• Provide the power of indirect addressing
• Provide a way to manage dynamic memory
• A pointer can be used to access a location in
the area where storage is dynamically created
(heap)

62
Design Issues of Pointers
• What are the scope and lifetime of a pointer
variable?
• What is the lifetime of a heap-dynamic variable?
• Are pointers restricted as to the type of value to
which they can point?
• Are pointers used for dynamic storage
management, indirect addressing, or both?
• Should the language support pointer types,
reference types, or both?

63
Pointer Operations
• Two fundamental operations:
– assignment and dereferencing
• Assignment sets a pointer variable’s value to
some useful address
• Dereferencing yields the value stored at the
location represented by the pointer’s value
– Dereferencing can be explicit or implicit
– C and C++ use an explicit operator *
j = *ptr;
sets j to the value located at ptr

64
Pointer Assignment
• The assignment operation j = *ptr

65
Problems with Pointers
• Dangling pointers (dangerous)
– A pointer points to a heap-dynamic variable that has
been deallocated
→ has pointer but no storage
– What happens when dereferencing a dangling pointer?
• Lost heap-dynamic variable
– An allocated heap-dynamic variable that is no longer
accessible to the user program (often called garbage)
→ has storage but no pointer
• The process of losing heap-dynamic variables is called
memory leakage

66
Dangling Pointer

67
Why & How Dangling Pointers?
• The following sequence of operations creates a
dangling pointer in many languages:
– 1. A new heap-dynamic variable is created and
pointer p1 is set to point at it.
– 2. Pointer p2 is assigned p1’s value.
– 3. The heap-dynamic variable pointed to by p1 is
explicitly deallocated (possibly setting p1 to NIL), but
p2 is not changed by the operation. p2 is now a
dangling pointer.
• If the deallocation operation did not change p1, both p1 and
p2 would be dangling. (Of course, this is a problem of
aliasing—p1 and p2 are aliases.)
68
For example, in C++
int * arrayPtr1;
int * arrayPtr2 = new int[100];
arrayPtr1 = arrayPtr2;
delete [] arrayPtr2;
// Now, arrayPtr1 is dangling, because the heap storage
// to which it was pointing has been deallocated.

• In C++, both arrayPtr1 and arrayPtr2 are now dangling pointers, because
the C++ delete operator has no effect on the value of its operand pointer.
• In C++, it is common (and safe) to follow a delete operator with an
assignment of zero, which represents null, to the pointer whose pointed-
to value has been deallocated.
• Notice that the explicit deallocation of dynamic variables is the cause of
dangling pointers.
69
Pointers in C and C++
• Extremely flexible but must be used with care
• Pointers can point at any variable regardless of when it
was allocated
• Used for dynamic storage management and addressing
• Pointer arithmetic is possible
• Explicit dereferencing (*) and address-of operators (&)
• Domain type need not be fixed (void *)
– void * can point to any type and can be type checked
– cannot be dereferenced without a cast to a typed pointer

70
Pointer Arithmetic in C and C++

float stuff[100];
float *p;
p = stuff;

*(p+5) is equivalent to stuff[5] and p[5]


*(p+i) is equivalent to stuff[i] and p[i]
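The address computation behind *(p+5) can be imitated in Python with `ctypes`. This is a sketch only: `p` here is a plain integer address, the helper `deref` is a made-up name, and the scaling by the element size is written out explicitly, whereas C performs it implicitly.

```python
import ctypes

# float stuff[100]; filled so that stuff[i] == i
stuff = (ctypes.c_float * 100)()
for i in range(100):
    stuff[i] = float(i)

p = ctypes.addressof(stuff)                 # p = stuff;
elem_size = ctypes.sizeof(ctypes.c_float)   # the scale factor C applies to p + i

def deref(address):
    """Read the float stored at a raw address, like the C operator *."""
    return ctypes.c_float.from_address(address).value

fifth = deref(p + 5 * elem_size)            # *(p+5)
print(fifth, stuff[5])
```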

71
Reference Types
• A reference type variable refers to an object or a value in memory,
while a pointer refers to an address
→ not sensible to do arithmetic on references
• C++ reference type variable: a constant pointer that is implicitly
dereferenced; primarily for formal parameters → initialized with
address at definition, remains constant
– Reference type variables are specified in definitions by
preceding their names with ampersands (&).
int result = 0;
int &ref_result = result;
...
ref_result = 100;
• Java uses reference variables to replace pointers entirely
– Not constants, can be assigned; reference to class instances
String str1;
...
str1 = "This is a string.";
72
Evaluation of Pointers
• Dangling pointers and dangling objects are
problems as is heap management
• Pointers are like goto's → they widen the
range of cells that can be accessed by a
variable
• Pointers or references are necessary for
dynamic data structures → so we cannot
design a language without them

73
Solution to Dangling Pointer Problem
• Tombstone: extra heap cell that is a pointer to
the heap-dynamic variable
– The actual pointer variable points only at
tombstones
– When heap-dynamic variable de-allocated,
tombstone remains but is set to NIL
– Any pointer variables pointing to that heap-
dynamic variable will know it is gone by noticing
tombstone becomes NIL
– Costly in time and space
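The tombstone scheme can be simulated in a few lines of Python. All names here (`Tombstone`, `Pointer`, `allocate`, `deallocate`) are hypothetical, `None` stands in for NIL, and a one-element list plays the role of the heap cell. This is a sketch of the mechanism, not of any real run-time system.

```python
class Tombstone:
    """Extra heap cell that holds the only direct pointer to the variable."""
    def __init__(self, target):
        self.target = target          # None plays the role of NIL

class Pointer:
    """A program pointer: it points at a tombstone, never at the variable."""
    def __init__(self, tombstone):
        self.tombstone = tombstone

    def deref(self):
        if self.tombstone.target is None:
            raise RuntimeError("dangling pointer detected")
        return self.tombstone.target

def allocate(value):
    return Pointer(Tombstone([value]))    # the list stands for a heap cell

def deallocate(pointer):
    pointer.tombstone.target = None       # tombstone remains, set to NIL

p1 = allocate(42)
p2 = Pointer(p1.tombstone)    # p2 = p1: both share the same tombstone
before = p2.deref()[0]

deallocate(p1)
try:
    p2.deref()
    detected = False
except RuntimeError:
    detected = True

print(before, detected)
```

Every dereference goes through one extra level of indirection, and tombstones are never reclaimed, which is where the cost in time and space comes from.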

74
Tombstone

75
Solution to Dangling Pointer Problem
• Locks-and-keys:
– Heap-dynamic variable represented as (variable, lock)
– Associated pointer represented as (key, address)
– When heap-dynamic variable allocated,
• a lock is placed in lock cell of the heap-dynamic variable
• as well as the key cell of the corresponding pointer variable
– Any copies of the pointer value to other pointer
variables must also copy the key value
– When a heap-dynamic variable is deallocated, its lock
value is cleared to an illegal (NIL) lock value
– Any remaining pointers will have a key that no longer
matches the lock, so access through them is detected as an error
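A Python simulation of locks and keys, under the assumption that lock values are fresh non-zero integers and that 0 is the illegal lock value. The names `HeapVar`, `KeyedPointer`, `allocate`, and `deallocate` are made up for this sketch.

```python
import itertools

_lock_values = itertools.count(1)   # source of fresh, non-zero lock values
ILLEGAL_LOCK = 0                    # assumed marker for a deallocated variable

class HeapVar:
    """Heap-dynamic variable represented as (value, lock)."""
    def __init__(self, value):
        self.value = value
        self.lock = next(_lock_values)

class KeyedPointer:
    """Pointer represented as (key, address)."""
    def __init__(self, var, key):
        self.var = var
        self.key = key

    def copy(self):
        # Copying a pointer copies the key along with the address.
        return KeyedPointer(self.var, self.key)

    def deref(self):
        # Every access compares the pointer's key with the variable's lock.
        if self.key != self.var.lock:
            raise RuntimeError("key does not match lock")
        return self.var.value

def allocate(value):
    var = HeapVar(value)
    return KeyedPointer(var, var.lock)

def deallocate(pointer):
    pointer.var.lock = ILLEGAL_LOCK

p1 = allocate("data")
p2 = p1.copy()
value_before = p2.deref()

deallocate(p1)
try:
    p2.deref()
    mismatch_detected = False
except RuntimeError:
    mismatch_detected = True

print(value_before, mismatch_detected)
```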

76
BEST SOLUTION
• The best solution to the dangling-pointer
problem is
– to take deallocation of heap-dynamic variables out of
the hands of programmers.
• If programs cannot explicitly deallocate heap-
dynamic variables, there will be no dangling
pointers.
• To do this, the run-time system must implicitly
deallocate heap-dynamic variables when they are
no longer useful, as in Java and C#
– C#'s pointers do not include implicit deallocation.
77
Heap Management
• Two approaches to reclaim garbage
– Reference counters (eager): reclamation is gradual
– Garbage collection (lazy): reclamation occurs when
the list of available space becomes empty
• Reference counters:
– A counter in every variable, storing number of
pointers currently pointing at that variable
– If counter becomes zero, variable becomes garbage
and can be reclaimed
– Disadvantages: space required, execution time
required, complications for cells connected circularly
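The following Python sketch shows reference counting and the circular-structure problem. The `Cell` class and the helpers `add_ref` and `drop_ref` are hypothetical, and each cell has at most one outgoing pointer to keep the example short.

```python
class Cell:
    """Heap cell with a reference counter and one optional outgoing pointer."""
    def __init__(self, name):
        self.name = name
        self.count = 0
        self.ref = None

reclaimed = []   # names of cells returned to free storage, in order

def add_ref(cell):
    cell.count += 1

def drop_ref(cell):
    cell.count -= 1
    if cell.count == 0:
        # Eager reclamation: the cell is freed as soon as nothing points at it.
        reclaimed.append(cell.name)
        if cell.ref is not None:
            drop_ref(cell.ref)     # its outgoing pointer disappears too
            cell.ref = None

# Acyclic case: program variable -> a -> b
a, b = Cell("a"), Cell("b")
add_ref(a)
a.ref = b
add_ref(b)
drop_ref(a)                        # both cells are reclaimed

# Circular case: program variable -> c -> d -> c
c, d = Cell("c"), Cell("d")
add_ref(c)
c.ref = d
add_ref(d)
d.ref = c
add_ref(c)
drop_ref(c)                        # the program lets go, but counts stay at 1

print(reclaimed, c.count, d.count)
```

After the last call, c and d are unreachable from the program, yet neither counter reaches zero, so they are never reclaimed.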

78
Garbage Collection
• Run-time system allocates cells as requested and
disconnects pointers from cells as needed.
Garbage collection when out of space
– Every heap cell has a bit used by collection algorithm
– All cells initially set to garbage
– All pointers traced into heap, and reachable cells
marked as not garbage
– All garbage cells returned to list of available cells
– Disadvantages: when you need it most, it works worst
(takes most time when program needs most of cells in
heap)
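The marking steps above can be sketched as a small Python function. The heap is modelled as a dictionary from cell names to the names they point to; this representation, the function name `mark_sweep`, and the sample heap are assumptions made for illustration.

```python
def mark_sweep(heap, roots):
    """Return (live, freed) cell names for a heap given as name -> pointees."""
    # Step 1: every cell is initially assumed to be garbage.
    marked = {name: False for name in heap}

    # Step 2: trace all pointers from the roots and mark reachable cells.
    stack = list(roots)
    while stack:
        name = stack.pop()
        if not marked[name]:
            marked[name] = True
            stack.extend(heap[name])

    # Step 3: unmarked cells go back to the list of available space.
    live = sorted(name for name in heap if marked[name])
    freed = sorted(name for name in heap if not marked[name])
    return live, freed

# a -> b is reachable from the program; c <-> d is an unreachable cycle; e is unused.
heap = {"a": ["b"], "b": [], "c": ["d"], "d": ["c"], "e": []}
live, freed = mark_sweep(heap, roots=["a"])
print(live, freed)
```

Unlike reference counting, tracing reclaims the circular pair c and d, because reachability rather than a counter decides what is garbage.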

79
Overview
• Various Data Types
– Primitive Data Types, Character String Types, User-
Defined Ordinal Types, Array Types, Associative Arrays,
Record Types, Union Types, Pointer and Reference
Types
• Type Binding
• Type Checking
• Strong Typing
• Type Equivalence
• Theory and Data Types
80
Type Binding
• Before a variable can be referenced in a program,
it must be bound to a data type
– How is a type specified?
– When does the binding take place?
• If static, the type may be specified by either
– Explicit declaration: by using declaration statements
– Implicit declaration: by a default mechanism, e.g., the
first appearance of the variable in the program
– Fortran, PL/1, BASIC, Perl, Python have implicit
declarations
• Advantage: writability
• Disadvantage: reliability
81
Dynamic Type Binding
• A variable is assigned a type when it is assigned a
value in an assignment statement and is given the
type of RHS, e.g., in Python, JavaScript and PHP
list = [2, 4.33, 6, 8];
list = 17.3;
– Advantage: flexibility (generic for processing data of
any type, esp. any type of input data)
– Disadvantages:
• High cost (dynamic type checking and interpretation)
• Less readable, difficult to detect type error by compiler
→ such PLs are usually implemented with interpreters [e.g., Python IDLE]
82
Type Inference
• Types of expressions may be inferred from the
context of the reference, e.g., in ML, Miranda,
and Haskell
fun square(x) = x * x;
– Arithmetic operator * sets function and
parameters to be numeric, and by default to be
int
square(2.75); (* error! *)
fun square(x) : real = x * x; (* correct *)

83
Overview
• Various Data Types
– Primitive Data Types, Character String Types, User-
Defined Ordinal Types, Array Types, Associative Arrays,
Record Types, Union Types, Pointer and Reference
Types
• Type Binding
• Type Checking
• Strong Typing
• Type Equivalence
• Theory and Data Types
84
Type Checking
• The activity of ensuring that the operands of an
operator are of compatible types
– A compatible type is one that is either legal for the
operator, or is allowed to be implicitly converted, by
compiler-generated code, to a legal type, e.g.,
(int) A =(int) B + (real) C
– This automatic conversion is called a coercion
• If all bindings of variables to types are static in a
language, then type checking can nearly always be
done statically.
• Dynamic type binding requires type checking at run
time, which is called dynamic type checking.
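Python is a convenient place to observe dynamic type checking and coercion together. In the sketch below, the helper `check_add` is a made-up name; it shows an int being coerced to float for a mixed addition, and an incompatible operand pair being rejected at run time.

```python
def check_add(a, b):
    """Add two operands, reporting a run-time type error instead of crashing."""
    try:
        return a + b
    except TypeError:
        return "type error"

coerced = check_add(3, 2.5)       # int operand is coerced to float
rejected = check_add(3, "Bill")   # no legal coercion exists for int + str

print(coerced, type(coerced).__name__, rejected)
```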
85
Strong Typing
• A programming language is strongly typed if type
errors are always detected
– Advantage: allows the detection of the misuses of
variables that result in type errors
• FORTRAN 77 is not: EQUIVALENCE
• Ada is nearly strongly typed
• C and C++ are not: unions are not type checked
• Coercion rules can weaken strong typing
• Example: a and b are int; d is float;
– If a programmer mistakenly types a + d instead of a + b, the
error would not be detected by the compiler. The
value of a would simply be coerced to float.
86
Type Equivalence
• Type checking checks compatibility of operand
types for operators → compatibility rules
• Simple and rigid for predefined scalar types
• Complex for structured types, e.g., arrays,
structures, user-defined types
– They are seldom coerced → no need to check
compatibility
– Important to check equivalence, i.e., compatibility
without coercion → how to define type equivalence?

87
Name Type Equivalence
• Two variables have equivalent types if they are in
either the same declaration or in declarations
that use the same type name
• Easy to implement but highly restrictive:
– Subranges of integer types are not equivalent with
integer types, e.g., Ada
type Indextype is range 1..100;
count : Integer;
index : Indextype;
– Formal parameters must be the same type as their
corresponding actual parameters
88
Structure Type Equivalence
• Two variables have equivalent types if their types
have identical structures
• More flexible, but harder to implement → need
to compare entire structures
– Are two record types compatible if they are
structurally the same but use different field names?
– Are two array types compatible if they are the same
except the subscripts? e.g. [1..10] and [0..9]
– Are two enumeration types compatible if their
components are spelled differently?
– How about type celsius & fahrenheit of
float?
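The celsius and fahrenheit question can be made concrete with a Python sketch. The two dataclasses and the functions `name_equivalent` and `structure_equivalent` are hypothetical; they only illustrate the two definitions and are not how any particular compiler decides equivalence.

```python
from dataclasses import dataclass, fields

@dataclass
class Celsius:
    degrees: float

@dataclass
class Fahrenheit:
    degrees: float

def name_equivalent(a, b):
    """Equivalent only if both values were declared with the same type name."""
    return type(a) is type(b)

def structure_equivalent(a, b):
    """Equivalent if the field names and field types match, in order."""
    def shape(obj):
        return [(f.name, f.type) for f in fields(obj)]
    return shape(a) == shape(b)

c = Celsius(20.0)
f = Fahrenheit(68.0)
by_name = name_equivalent(c, f)
by_structure = structure_equivalent(c, f)
print(by_name, by_structure)
```

Under structure equivalence the two temperature types are interchangeable, which is exactly the kind of mix-up name equivalence prevents.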
89
Type Equivalence in C
• C uses both name and structure type
equivalence
• Name type equivalence for struct, enum,
union
– Each declaration creates a new type that is not
equivalent to any other type
• Structure type equivalence for other nonscalar
types, e.g., array
• typedef only defines a new name for an
existing type, not new type
90
Summary
• The data types of a language in large part determine
that language's style and usefulness
• Primitive data types of most imperative lang.
include numeric, character, and Boolean
• The user-defined enumeration and subrange
types are convenient and add to the readability
and reliability of programs
• Arrays and records included in most languages
• Pointers are used for addressing flexibility and to
control dynamic storage management

91
