
Principles of Programming Language - Chapter 6

Deduced from concepts of programming language book by Robert Sebesta

Uploaded by

likhi4951
Copyright
© © All Rights Reserved

6th chapter

6.1 Introduction
A data type bundles together:

A set of values

The operations you can do on those values

Programs work by manipulating data, so it’s crucial that your language’s built-in
types match the real-world concepts in your problem. When they do, your code
is easier to write, read, and update.

Evolution of Data Types


Over the past 60 years, high-level languages have gone from a handful of
types to very flexible systems:

Early Fortrans (pre-1990): Only had arrays, so programmers built lists and
trees inside those arrays.

COBOL: Added decimal-precision control and records (like C structs) for grouping mixed data.

PL/I: Let you set accuracy for integers and floats.

ALGOL 68: Introduced type constructors (a few basic types plus operators for building custom structures), one of the biggest leaps in type design.

User-defined types: Improved readability, modularity, and allowed the compiler to catch more errors.

Abstract Data Types (ADTs): Common since the mid-1980s, separate interface (visible operations) from implementation (hidden storage).

Why Type Systems Matter


A language’s type system helps in three ways:

1. Error detection (type checking)

2. Modularization (cross-module interface consistency)

3. Documentation (a declaration like CustomerRecord immediately tells you what
data you’re handling)

Understanding type checking rules, implicit conversions (coercions), and when checks happen (compile-time vs. run-time) is key to writing correct, maintainable code.

6.2 Primitive Data Types


A primitive data type is one you can’t build from other types—it’s provided by
the language. You use primitives as the building blocks for arrays, records, and
the like.

6.2.1 Numeric Types


Numeric primitives remain central to almost every language.

6.2.1.1 Integer
Stores whole numbers (… –2, –1, 0, 1, 2, …).

Hardware usually supports several fixed sizes:

8-bit, 16-bit, 32-bit, 64-bit

Signed vs. unsigned (no negative values)

Some languages (Python, F#) add arbitrary-precision "long" integers with no fixed upper limit.

Negative numbers normally use two’s-complement:

Invert bits of the positive value

Add 1

→ Makes addition/subtraction circuits simpler
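The invert-then-add-1 rule above can be tried out directly. The following is only a small sketch: the function name twos_complement and the default width of 8 bits are assumptions chosen for illustration, not something fixed by the notes.

```python
def twos_complement(value: int, bits: int = 8) -> str:
    """Return the bit pattern that represents -value in two's complement."""
    mask = (1 << bits) - 1            # all ones for the chosen width, e.g. 0xFF
    inverted = ~value & mask          # step 1: invert the bits of the positive value
    negated = (inverted + 1) & mask   # step 2: add 1, dropping any carry out of the word
    return format(negated, f"0{bits}b")


print(twos_complement(5))  # bit pattern used for -5 in an 8-bit word
```

Adding the pattern for 5 and the pattern for -5 with ordinary binary addition gives all zeros (after dropping the carry), which is why the hardware needs no separate subtraction circuit.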

6.2.1.2 Floating-Point
Models real numbers by storing:

1 sign bit

Exponent

Fraction (mantissa)

Because many decimals (e.g., 0.1, √2, e) have no exact binary form, these are
approximate.

Follows IEEE 754 standard on nearly every modern machine:

float (32 bits)

double (64 bits)

Watch out for rounding errors when doing arithmetic repeatedly.
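A quick way to see the rounding problem is to add 0.1 repeatedly. This is a minimal sketch; the variable name total and the use of math.isclose as the remedy are illustrative choices, not part of the original notes.

```python
import math

total = 0.0
for _ in range(10):
    total += 0.1          # 0.1 has no exact binary representation

# Exact comparison is unreliable because the small errors accumulate.
print(total == 1.0)

# Comparing within a tolerance is the usual workaround.
print(math.isclose(total, 1.0))
```

The general habit this suggests: never test floating-point results with ==; compare within a tolerance instead.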

6.2.1.3 Complex
Supported by languages like Fortran and Python.

Pairs of floats (real, imag) represent a + b·i .


In Python:

z = 7 + 3j

6.2.1.4 Decimal
Used in business apps (COBOL, C#, F#) where exact decimal arithmetic is
required (e.g., money).

Stores digits in binary-coded decimal (BCD) instead of pure binary.

Advantage: 0.1 is stored exactly—no 0.0999999… surprises.

Trade-offs:

No exponent (limited range)

A little more storage—6 decimal digits require 24 bits in BCD vs. 20 bits
in pure binary
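The exactness advantage and the storage figures above can be checked with a short sketch. One assumption to flag: Python's decimal module is used here only to show exact decimal arithmetic; it is not claimed to store digits as BCD internally.

```python
from decimal import Decimal

# Binary floating point cannot hold 0.1 or 0.2 exactly, so the sum is slightly off.
binary_sum = 0.1 + 0.2

# A decimal type keeps the digits exactly as written.
decimal_sum = Decimal("0.1") + Decimal("0.2")

print(binary_sum == 0.3)              # comparison on binary floats
print(decimal_sum == Decimal("0.3"))  # comparison on decimal values

# Storage cost for 6 decimal digits: 4 bits per digit in BCD
# versus the bits needed for the largest 6-digit value in pure binary.
digits = 6
bcd_bits = digits * 4
binary_bits = (10**digits - 1).bit_length()
print(bcd_bits, binary_bits)
```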

6.2.2 Boolean Types


Only two values: true / false

Introduced in ALGOL 60 (1960) and nearly universal since

Perfect for flags and switches

Implemented in a full byte on most machines because single-bit addressing is inefficient

6.2.3 Character Types
Characters are stored as numeric codes

Encoding options:

ASCII (7-bit, 128 symbols)

ISO 8859-1 (8-bit, 256 symbols)

For global writing systems, we use Unicode:

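UCS-2 (16-bit) → the original Unicode encoding; Java, Python, C#, and Swift all have built-in Unicode support, though their internal encodings differ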

UTF-32 (32-bit) → every code point

Most languages provide a single-character type:

char in C/Java/C#

Length-1 string in Python

6.3 Character String Types


A string is a sequence of characters (letters, digits, symbols).

Used for: text output/input, labels, file names, user messages, etc.

6.3.1 Design Issues


Primitive vs. Array

Built-in type vs. just arrays of characters

If arrays, string operations = loop code unless library support exists

Static vs. Dynamic Length

Static: Fixed at declaration

Dynamic: Can grow/shrink at run-time

Limited Dynamic: Max size declared; actual length varies at run-time

6.3.2 Common Operations


Assignment – Copy one string into another

Concatenation – "Hello, " + "world!"

Substring Reference (Slice) – Extract part of a string, e.g., characters 3–5

Comparison – Equality, inequality, lexicographic order

Pattern Matching (Regex) –


Example: /[A-Za-z][A-Za-z0-9]+/ matches an identifier:

First character: letter

Following characters: one or more letters or digits (the + requires at least one)
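The pattern above can be tested with Python's re module. This is a small sketch; the helper name is_identifier is made up for illustration. Note that because of the +, a single-letter name does not match this particular pattern.

```python
import re

# Same pattern as in the notes: a letter followed by one or more letters or digits.
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9]+")


def is_identifier(text: str) -> bool:
    """True only if the whole string matches the pattern."""
    return IDENTIFIER.fullmatch(text) is not None


for candidate in ["x1", "total2", "1x", "x"]:
    print(candidate, is_identifier(candidate))
```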

C / C++ Style
Strings = char arrays, null-terminated ( '\0' )

Library: <string.h> / <cstring>

strcpy(dest, src) – Copy src into dest (unsafe if too small)

strcat(dest, src) – Append src to dest

strcmp(a, b) – Lexicographic comparison

strlen(s) – Length (excluding null)

⚠️ Buffer overflow hazard – No automatic space check on destination

Java / C# Style
Java: String (immutable), StringBuilder (mutable)

C#: string (immutable), StringBuilder (mutable)

Common methods:

concat() , substring() , indexOf() , replace()

Regex support through libraries

Python Style
Strings = primitive, immutable type

Treated like read-only character arrays

s[2:5] → slice

s + "!" → concatenation

Methods: s.find() , s.replace() , s.upper() , etc.

List-style slicing: s[start:stop:step]

Other Languages
F# – string class, + for concat

ML – Primitive, immutable string; ^ for concat

Swift – String type, + , append() , character collections

6.3.3 String Length Options


Static Length

Fixed at compile time

Fast, but inflexible

Limited Dynamic Length

Max size declared

Example: char buf[100] in C

Dynamic Length

Grows/shrinks at run-time

Examples: Python strings, Java StringBuilder , Ruby strings

6.3.4 Evaluation
Built-in strings → better readability and writability

Immutable strings (Java, Python) → safer, no hidden changes

Mutable string buffers (e.g., StringBuilder ) → more efficient for building large
text

Dynamic length → flexible, but adds allocation/deallocation overhead

6.3.5 Implementation Outline


Static strings:

Compiler keeps a descriptor with:

Type name

Fixed length

Address of first character

Limited dynamic:

Run-time descriptor adds current length

Fully dynamic:

Run-time descriptor holds current length and address

Storage allocated on the heap

Heap Storage Strategies:


Linked-list of characters – Simple but slow

Array of pointers to characters – Faster, uses extra memory

Contiguous block (common approach) –

On growth: allocate larger block, copy old text, free old block

6.4 Enumeration Types


An enumeration type (enum) is a data type whose values are all listed in its
definition as named constants.
Enums let you give meaningful names to a fixed set of related values.

6.4.1 Core Definition


Declare an enum by naming each possible value.

Under the hood, most implementations map names to integers (starting at 0 by default).

Example in C#:

enum Day { Mon, Tue, Wed, Thu, Fri, Sat, Sun };

6.4.2 Key Design Questions


Uniqueness of Names

Can the same name appear in two separate enums?

If so, how does the compiler distinguish?

Coercion to Integer

Are enum values automatically converted to ints?

Enables math on enums (often unwanted).

Coercion from Integer

Can you assign any integer to an enum variable?

If yes, may assign invalid (out-of-set) values.
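The two coercion questions can be explored with Python's standard enum module. This is a hedged sketch: the class names Day and Level and the helper functions are hypothetical, chosen only to contrast an enum that refuses integer arithmetic (Enum) with one that allows it (IntEnum).

```python
from enum import Enum, IntEnum


class Day(Enum):        # plain Enum: members are not integers
    MON = 0
    TUE = 1


class Level(IntEnum):   # IntEnum: members behave like ints
    LOW = 0
    HIGH = 1


def adds_to_int(member) -> bool:
    """Coercion to integer: does member + 1 work?"""
    try:
        member + 1
        return True
    except TypeError:
        return False


def day_from_int(number):
    """Coercion from integer: only values listed in the enum are accepted."""
    try:
        return Day(number)
    except ValueError:
        return None


print(adds_to_int(Day.MON), adds_to_int(Level.LOW))
print(day_from_int(1), day_from_int(7))
```

The plain Enum answers both design questions conservatively: no arithmetic, and no out-of-set values.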

6.4.3 How Various Languages Implement Enums

C / C++

enum Color { Red, Blue, Green = 5, Yellow };

Default mapping: Red→0 , Blue→1 , Green→5 , Yellow→6

Enum → int is implicit

int → Enum needs a cast in C++ (C converts implicitly)

Weaknesses: allows arithmetic, any int can be cast

Java (since Java 5)

enum Color { RED, BLUE, GREEN, YELLOW; }

Enums are classes derived from java.lang.Enum

No implicit int conversions

Methods:

values() : returns all values

ordinal() : position

toString() : name

Strong safety: No mixing with ints/other enums

C#

enum Color { Red, Blue, Green, Yellow };

No implicit int conversion (must cast)

Cannot assign out-of-range values without cast

Safer than C++

ML / F#

type Day = Mon | Tue | Wed | Thu | Fri | Sat | Sun

Tags used internally (no math on values)

Swift

enum Day { case Mon, Tue, Wed, Thu, Fri, Sat, Sun }

Not backed by Int unless specified

Enforced exhaustive switch

Scripting Languages
(Perl, Python, Ruby, JS, PHP)

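Most have no built-in enum syntax (Python has offered an enum standard-library module since 3.4, and PHP has had native enums since 8.1)

Otherwise, use constants or dictionaries (hashes)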

6.4.4 Advantages & Trade-Offs


Pros:

Readability ( Color.Red vs. 3 )

Safety: compiler limits to valid values

Self-documenting code

Cons:

Extra syntax in some languages

Still need casts for int-based use

Not available in many scripting languages

6.4.5 Under-the-Hood Implementation


Enum stored like an integer

Each name = constant integer “tag”

Compiler maps names to values at compile time

6.5 Array Types


An array is a collection of elements of the same type, stored in consecutive
memory.
Accessed via subscript/index, which gives the element’s offset.

6.5.1 Core Definition


Homogeneous: All elements same type

Subscripted access:

arrayName[index] (C/Java)

arrayName(index) (older languages)

Address formula:

baseAddress + (index – lowerBound) * elementSize

Where:

baseAddress : start location

lowerBound : usually 0 or 1

elementSize : bytes per element
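The address formula is plain arithmetic, so it can be checked with a few lines. The function name element_address and the sample numbers (base address 1000, 4-byte elements) are assumptions for illustration only.

```python
def element_address(base_address: int, index: int,
                    lower_bound: int, element_size: int) -> int:
    """baseAddress + (index - lowerBound) * elementSize"""
    return base_address + (index - lower_bound) * element_size


# 4-byte elements starting at address 1000.
print(element_address(1000, 3, 0, 4))  # 0-based array, index 3
print(element_address(1000, 3, 1, 4))  # 1-based array, index 3
```

With a lower bound of 0 the subtraction disappears, which is one reason 0-based indexing is cheap to implement.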

6.5.2 Key Design Issues


Index Type & Range Checking

Usually integer

Run-time error vs. silent ignore?

When Are Bounds Fixed?

Static: Known at compile-time

Stack-dynamic: Size fixed, allocated on call stack

Heap-dynamic (fixed): Allocated at run-time, fixed size

Heap-dynamic (resizable): Grows/shrinks at run-time

Dimensions

Single: a[0..99]

Multi: m[0..9][0..9]

Rectangular: rows = equal length

Jagged: row lengths differ

Initialization – Can you give a value list?

Slices – Can you access a subarray?

6.5.3 Subscript Bindings & Array Categories


Category               | Bounds Fixed At | Storage From          | Use Case
Static                 | Compile-time    | Static segment        | Global constants, tables
Fixed Stack-Dynamic    | Compile-time    | Stack, on block entry | Local arrays in C/C++
Fixed Heap-Dynamic     | Run-time        | Heap                  | new int[100] in Java/C#
Resizable Heap-Dynamic | Run-time        | Heap                  | Python lists, C# List<T>

Static: fast but inflexible

Stack-dynamic: reused stack space

Heap-dynamic (fixed): exact fit

Resizable: flexible, but reallocation overhead

6.5.4 Array Initialization


C / C++:

int a[] = { 4, 5, 7, 83 };

Java / C#:

int[] a = { 4, 5, 7, 83 };

Python:

a = [4, 5, 7, 83]

Perl:

@a = (4, 5, 7, 83);

6.5.5 Array Operations


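Operation     | C/C++            | Java/C#                                         | Python/Ruby
Assignment    | No (manual copy) | Direct ( a = b ), copies the reference          | Direct ( a = b ), copies the reference
Concatenation | Not built in     | + for strings only                              | + for lists
Comparison    | No built-ins     | Arrays.equals() in Java; == compares references | == (element-wise), is (identity, Python)
Slices        | No               | No                                              | a[2:5] (Python), a.slice(2, 3) (Ruby: start, length)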

C/C++: Use loops or memcpy

Java: Use System.arraycopy ; C#: Use Array.Copy or List<T> methods

Python: Full slice support

6.5.6 Rectangular vs. Jagged Arrays


Rectangular:

One contiguous block

Row-major (C/C++), Column-major (Fortran); Java's multidimensional arrays are arrays of arrays, so they belong with the jagged case

Jagged:

Array of pointers to rows

Each row can vary in length

Examples:

int[][] jagged = new int[3][];

jagged[0] = new int[5];

C#: Supports both int[,] (rectangular) and int[][] (jagged)

6.5.7 Slices
Python

v = [2,4,6,8,10,12]
sub = v[1:4] # [4,6,8]
step = v[0:6:2] # [2,6,10]

Perl

@v = (2,4,6,8,10,12);
@sub = @v[1..4]; # [4,6,8,10]

Ruby

v = [2,4,6,8,10,12]
sub = v.slice(1,3) # [4,6,8]

6.5.8 Evaluation
Arrays are fundamental to programming

Static arrays: fast, but rigid

Dynamic arrays: flexible, slower due to allocation

Slices and built-in ops: improve productivity; avoid manual loops

6.5.9 Implementation Sketch


Descriptor:
Stores element size , bounds , and base address

Access Function (1D):

addressOf(a[i]) = baseAddress + (i – lowerBound) * elementSize

Access Function (2D Row-Major):

addressOf(a[i,j]) = base + (((i – i₀)*numCols) + (j – j₀)) * elementSize

Each dimension adds 1 multiplication and 1 addition.
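The row-major formula can be sanity-checked against a flattened Python list. This is a sketch under simple assumptions: the function name address_2d, a base of 0, and an element size of 1 so that the "address" is just a position in the flat list.

```python
def address_2d(base: int, i: int, j: int, num_cols: int,
               element_size: int, i0: int = 0, j0: int = 0) -> int:
    """base + ((i - i0) * numCols + (j - j0)) * elementSize"""
    return base + ((i - i0) * num_cols + (j - j0)) * element_size


# A 3 x 4 matrix stored row by row in one flat list.
matrix = [[10, 11, 12, 13],
          [20, 21, 22, 23],
          [30, 31, 32, 33]]
flat = [value for row in matrix for value in row]

position = address_2d(base=0, i=2, j=3, num_cols=4, element_size=1)
print(position, flat[position], matrix[2][3])
```

Only the number of columns appears in the formula; the number of rows is needed for bounds checking but not for locating an element.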

📘 6.6 Associative Arrays


Associative arrays are special types of arrays where each value is linked
with a custom label or name called a key.

These keys work like unique identifiers that help you look up values.

In regular arrays, you use numbers (indexes) like 0, 1, 2… to access elements.

These follow a sequence, so you don’t need to store the numbers separately.

But in associative arrays, the keys are user-defined, so they must be stored with the values.

Each data item is a key-value pair (example: "name" → "Likhita" ).

This section explains associative arrays using Perl’s version of them.

Other languages that support associative arrays directly:

Python, Ruby, Swift

Languages like Java, C++, C#, and F# support them using built-in
classes.

🔧 6.6.1 Structure and Operations


In Perl, associative arrays are called hashes.

They use a hash function to store and retrieve data efficiently.

A hash function converts a key (like a name) into a number to find the value
quickly.

Hash variable names in Perl start with a percent sign (%).

Each item in a Perl hash has:

A key (usually a string like "Gary" )

A value (can be a string, number, or reference)

🧾 Example: Creating a Hash in Perl


%salaries = (
"Gary" => 75000,
"Perry" => 57000,
"Mary" => 55750,
"Cedric" => 47850);

🔄 Access / Update a Value


$salaries{"Perry"} = 58850;

➕ Add a New Entry


$salaries{"Shelly"} = 61000;

❌ Delete an Entry
delete $salaries{"Gary"};

🧹 Clear the Entire Hash


%salaries = ();

🔁 Hashes are Dynamic


They grow when new items are added.

They shrink when items are removed or the hash is cleared.

🔍 Check If a Key Exists


if (exists $salaries{"Shelly"}) {
...
}

📋 Get Keys, Values, and Loop Through Hash
keys %salaries → Returns all keys

values %salaries → Returns all values

each %salaries → Loop over key-value pairs
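For comparison, the same sequence of Perl operations (update, add, delete, existence check, keys, iteration) can be written with a Python dictionary. This is a rough equivalent for illustration, not a line-by-line translation of Perl semantics.

```python
salaries = {"Gary": 75000, "Perry": 57000, "Mary": 55750, "Cedric": 47850}

salaries["Perry"] = 58850        # update an existing value
salaries["Shelly"] = 61000       # add a new entry
del salaries["Gary"]             # delete an entry

has_shelly = "Shelly" in salaries    # like Perl's exists
names = list(salaries.keys())        # like keys %salaries
amounts = list(salaries.values())    # like values %salaries

for name, salary in salaries.items():    # like each %salaries
    print(name, salary)
```

Calling salaries.clear() afterwards would empty the dictionary, matching %salaries = () in Perl.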

🌐 In Other Languages
Python: Associative arrays are called dictionaries

Work like Perl hashes

Values are object references

Ruby: Similar to Perl

Keys can be any object, like strings, numbers, or even other objects

⚙️ Key Flexibility in Different Languages


Perl: keys must be strings

PHP: keys can be strings or integers

Ruby: keys can be any object

⚠️ Important Note: Avoid using keys that can change (like arrays or hashes) because the hash function may stop working correctly if the key changes.
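Python enforces this advice directly: its built-in mutable containers cannot be used as dictionary keys at all. The helper name usable_as_key below is hypothetical, and the behaviour shown is Python's, not necessarily that of Perl, PHP, or Ruby.

```python
def usable_as_key(candidate) -> bool:
    """Try to build a one-entry dictionary with the candidate as its key."""
    try:
        {candidate: "value"}
        return True
    except TypeError:      # raised for unhashable (mutable) objects
        return False


print(usable_as_key("Gary"))      # string: immutable
print(usable_as_key((1, 2)))      # tuple: immutable
print(usable_as_key([1, 2]))      # list: mutable
print(usable_as_key({"a": 1}))    # dict: mutable
```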

🧪 PHP Arrays
PHP arrays are flexible:

Can work as indexed arrays (using numbers)

Or associative arrays (with custom keys)

You can even mix both in the same array

🧾 Swift Dictionaries
Called dictionaries in Swift

All keys must be the same type

Values are usually objects; storing values of different types requires a general value type such as Any

✅ When to Use Associative Arrays
Use them when:
- You want to search values quickly
- You need to store related data (e.g., name & salary)
Avoid them when:
- You need to process every item one by one (regular arrays are faster)

🛠️ 6.6.2 Implementing Associative Arrays


Perl hashes allow quick lookups

When more elements are added, they reorganize themselves efficiently

Each entry in Perl’s hash table gets a 32-bit hash value

As the table grows, more bits are used, which helps it scale smoothly

When expanding, only half the elements need to be moved, saving time

PHP also uses hash functions to place elements

But PHP remembers the order in which items were added

Useful for functions like current() and next() to move through elements in
order

6.7 Record Types

What Are Record Types?


A record is a data structure that stores a group of related values, where
each value is labeled with a field name.

Each piece of data inside a record is called a field, and instead of using
index numbers (like in arrays), we use field names to access them.

For example, to store a student's details like name, student number, and
grade point average:

Name can be a string,

Student number can be an integer,

GPA can be a floating-point number.

All of these can be grouped into one record.

At first glance, records and heterogeneous arrays might look similar.

But in heterogeneous arrays (arrays with different types of elements), the values are scattered across different locations in memory (usually on the heap).

In records, the fields are stored next to each other (adjacent memory),
even if the sizes are different.

Records have been part of many programming languages since the 1960s,
starting with COBOL.

Some modern languages that support object-oriented programming (like Java or C#) allow data classes to serve as records.

In C, C++, C#, and Swift, the struct keyword is used to define records.

In C++, structs are almost the same as classes.

In C#, structs are value types, typically allocated on the stack rather than on the heap like class instances.

Structs are mostly used for organizing data (encapsulation).

In Python and Ruby, you can use hashes (dictionaries) to act like records.

These hashes can also be part of arrays if needed.

6.7.1 Definitions of Records


The main difference between records and arrays is:

Arrays use numeric indexes like [0] , [1] to access elements.

Records use names or identifiers for each field like .Name , .Age .

Also, records can include other structures, like unions (a topic discussed
in Section 6.10).

Example of a record in COBOL:

01 EMPLOYEE-RECORD.
   02 EMPLOYEE-NAME.
      05 FIRST  PICTURE IS X(20).
      05 MIDDLE PICTURE IS X(10).
      05 LAST   PICTURE IS X(20).
   02 HOURLY-RATE PICTURE IS 99V99.

Explanation of the above:

EMPLOYEE-RECORD contains:

EMPLOYEE-NAME (which itself contains FIRST, MIDDLE, LAST),

and HOURLY-RATE .

The numbers 01 , 02 , 05 are called level numbers. They show the hierarchical structure.

The PICTURE clause describes the format:

X(20) means 20 characters (alphanumeric),

99V99 means four decimal digits with an implied decimal point in the middle (the V marks its position).

In Java, records are declared using data classes, and nested records are
created using nested classes.

6.7.2 References to Record Fields


To refer to a field inside a record, different languages use different styles.

In COBOL, the format is:

field_name OF record_name_1 OF ... OF record_name_n

You start from the smallest field and move outward.

Example:

To refer to MIDDLE , the correct way is:

MIDDLE OF EMPLOYEE-NAME OF EMPLOYEE-RECORD

Most other languages (like C, Java, etc.) use dot notation:

Example:
Employee_Record.Employee_Name.Middle

In dot notation, you start from the outermost record and go to the inner
field.

A fully qualified reference means writing all the enclosing record names
to refer to a specific field.

A shorter version, called an elliptical reference, allows you to skip some of the enclosing record names, as long as the reference remains unambiguous.

Example of elliptical references (in COBOL):

FIRST

FIRST OF EMPLOYEE-NAME

FIRST OF EMPLOYEE-RECORD

All refer to the same field, as long as there’s no confusion in the program.

Elliptical references help the programmer write less, but they make the
compiler's job harder and can reduce readability.

6.7.3 Evaluation
Records are widely used and are very useful in programming languages.

Their design is simple, and they are safe to use.

Records and arrays are similar, but they are used in different situations:

Arrays are good when all elements are of the same type and are
processed the same way.

Example: looping through numbers in an array.

Records are better when:

Elements have different types (like a string, an integer, a float),

And when fields are not processed in the same way or order.

In records:

Field names work like fixed index names (called static subscripts).

This makes accessing data fast and easy.

If we used dynamic subscripts (changing indexes), it would:

Be slower,

And remove type-checking safety.

Conclusion:

Arrays and records solve related but different problems.

Both are efficient and useful depending on what kind of data you're
working with.

6.7.4 Implementation of Record Types


In memory, the fields of a record are stored next to each other.

But because each field can be a different size, we can’t use array-like
addressing (which expects same-size elements).

Instead, each field is given an offset address, which is the position of the
field relative to the start of the record.

The compiler uses this offset to find the location of each field.

The compiler creates a "descriptor" for the record at compile time.

A descriptor is a structure that stores all details about the record, like
field names, types, and offsets.

This helps the compiler understand how to access the fields.

No special structure is needed during program execution (run-time) to access fields.
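The offset idea can be sketched as a small compile-time-style calculation. Everything here is an assumption for illustration: the function name field_offsets, the student record, and the field sizes in bytes. Real compilers may also insert alignment padding between fields, which this sketch ignores.

```python
def field_offsets(fields):
    """Assign each field an offset from the start of the record.

    fields is a list of (name, size_in_bytes) pairs in declaration order.
    Returns the offset table and the total record size (no padding).
    """
    offsets = {}
    position = 0
    for name, size in fields:
        offsets[name] = position   # field starts where the previous one ended
        position += size
    return offsets, position


# Hypothetical student record: 20-byte name, 4-byte integer, 8-byte float.
student_fields = [("name", 20), ("student_number", 4), ("gpa", 8)]
offsets, record_size = field_offsets(student_fields)
print(offsets, record_size)

# Address of a field = address of the record + the field's offset.
record_address = 5000
gpa_address = record_address + offsets["gpa"]
print(gpa_address)
```

Because the offsets are constants known at compile time, a field access costs one addition, with no descriptor lookup at run time.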

6.8 Tuple Types

What Are Tuple Types?

A tuple is a type of data structure that is similar to a record, but with one
big difference: the elements inside a tuple are not given names.

Instead of using field names like Name or GPA , you access tuple elements
by position (like first , second , etc.).

In Python, tuples are immutable, which means you cannot change their
values after you create them.

If you need to modify a tuple:

1. Convert it into a list using list() .

2. Make changes.

3. Convert it back into a tuple using tuple() .

Tuples are useful when you want to protect data from being changed, like
when passing it to a function and making sure the function doesn't modify
it.

Even though tuples are similar to lists in Python, the key difference is:

Lists: can be changed (mutable),

Tuples: cannot be changed (immutable).
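The three-step workaround for "modifying" a tuple looks like this in practice. The variable names are illustrative only.

```python
original = (3, 5.8, "apple")

# Direct assignment to an element is rejected.
try:
    original[1] = 6.5
    blocked = False
except TypeError:
    blocked = True

working = list(original)   # 1. convert to a list
working[1] = 6.5           # 2. make the change
updated = tuple(working)   # 3. convert back to a tuple

print(blocked, original, updated)
```

Note that the original tuple is untouched; the result is a new tuple.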

Example of a Tuple in Python:

myTuple = (3, 5.8, 'apple')

This tuple has three elements:

an integer 3 ,

a float 5.8 ,

and a string 'apple' .

These elements can be of different types.

You can access a tuple element using brackets and index number:

myTuple[1]

This gives the second element of the tuple (in this case 5.8 ), because
Python indexing starts at 0.

Tuples in Python support several operations:

Concatenation using + (adding two tuples together),

Deleting a tuple using del ,

Other functions and operators also work on tuples.

Tuples in ML (Meta Language)


ML also supports tuples, and they must contain at least two elements.

In ML, tuples can also mix different data types, just like Python.

Example of a Tuple in ML:

val myTuple = (3, 5.8, "apple");

This stores a tuple with:

an integer,

a real number (float),

and a string.

To access an element of a tuple in ML:

#1(myTuple);

This gets the first element from the tuple.

You can also define a new tuple type in ML using a type declaration.

Example:

type intReal = int * real;

This defines a tuple type with:

an integer,

and a real number.

The * here does not mean multiplication. It just separates the components
and means "a combination of these types."

Tuples in F#
F# also has support for tuples.

You create a tuple by assigning a list of values (separated by commas and enclosed in parentheses) to a name using the let statement.

Example:

let tup = (3, 5, 7);;

This creates a tuple with 3 numbers.

You can unpack or decompose the tuple like this:

let a, b, c = tup;;

This assigns:

3 to a ,

5 to b ,

7 to c .

When you assign each element of the tuple to a variable like this, it’s called
a tuple pattern.

A tuple pattern is just a way to use multiple variables to hold the values
from a tuple at once.

For tuples with two elements in F#, you can use special functions:

fst(tup) gets the first value,

snd(tup) gets the second value.

Use of Tuples in Programming
In Python, ML, and F#, tuples are often used when a function needs to
return more than one value.

In Swift, tuples are passed by value (not by reference), which means:

A function gets a copy of the tuple,

So any changes inside the function don’t affect the original tuple.

This makes tuples a safe way to send grouped data into a function without
risking changes to the original values.
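Returning more than one value through a tuple can be sketched in Python as follows. The function name min_max is made up for this example.

```python
def min_max(values):
    """Return the smallest and largest value together as one tuple."""
    return min(values), max(values)


result = min_max([4, 9, 1, 7])     # the function returns a single tuple
lowest, highest = result           # unpacking splits it into two names

print(result, lowest, highest)
```

The unpacking on the second line plays the same role as the tuple pattern shown for F#.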

6.9 List Types



What Are Lists?
Lists are one of the oldest and most common data structures in
programming.

They were first used in a language called Lisp, the first functional
programming language.

Over time, lists became part of many functional languages (like Scheme,
Common Lisp, ML, F#), and more recently, also became available in
imperative languages like Python and C#.

List in Scheme and Common Lisp


In Scheme and Common Lisp, lists are written with parentheses, and
elements are placed without commas.

Example:

(A B C D)

You can also create nested lists (a list inside another list).

Example:

(A (B C) D)

In these languages, code and data use the same format.

For example:

(A B C)

If treated as code, this means: call the function A with parameters B and C.

Basic List Operations in Scheme


Scheme has four main list functions:

Two functions break down a list.

Two functions build up a list.

Breaking Down a List
CAR: Gets the first element of a list.

Example:

(CAR '(A B C)) ; returns A

Note: The quote ' before the list tells the interpreter to treat it as data,
not as code.

CDR: Returns the list without the first element.

Example:

(CDR '(A B C)) ; returns (B C)

Common Lisp also offers shortcut functions:

FIRST , SECOND , ..., TENTH for directly accessing list positions.

Building a List
CONS: Adds an element to the beginning of a list.

Example:

(CONS 'A '(B C)) ; returns (A B C)

LIST: Creates a new list from any number of elements.

Example:

(LIST 'A 'B '(C D)) ; returns (A B (C D))
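To see how the four functions fit together, here is a rough Python model of Scheme-style lists built from pairs. This is a sketch only: the names cons, car, cdr, make_list and to_python_list are stand-ins, and using None for the empty list is an assumption of this model rather than how Scheme represents it.

```python
NIL = None  # stands in for the empty list in this model


def cons(head, tail):
    """Put one element in front of an existing list."""
    return (head, tail)


def car(cell):
    """First element of a list."""
    return cell[0]


def cdr(cell):
    """The list without its first element."""
    return cell[1]


def make_list(*items):
    """Like LIST: build a list from any number of elements."""
    result = NIL
    for item in reversed(items):
        result = cons(item, result)
    return result


def to_python_list(cell):
    """Walk the pairs and collect the elements, for printing."""
    out = []
    while cell is not NIL:
        out.append(car(cell))
        cell = cdr(cell)
    return out


abc = make_list("A", "B", "C")
print(car(abc))                                            # like (CAR '(A B C))
print(to_python_list(cdr(abc)))                            # like (CDR '(A B C))
print(to_python_list(cons("A", make_list("B", "C"))))      # like (CONS 'A '(B C))
```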

Lists in ML (Meta Language)


ML also supports lists, but the syntax looks different.

Lists are written with square brackets and commas.

Example:

[5, 7, 9]

An empty list is [] , or you can also use the word nil .

Creating a List in ML
To add an item to the front of an existing list, use the double colon :: .

Example:

3 :: [5, 7, 9]  (* returns [3, 5, 7, 9] *)

Important Notes in ML Lists


All list elements must be of the same type.

Example:

[5, 7.3, 9]  (* illegal: mixes integers with a real *)

Accessing List Elements in ML


hd (short for “head”) gets the first element:

Example:

hd [5, 7, 9]  (* returns 5 *)

tl (short for “tail”) gives the rest of the list without the first element:

Example:

tl [5, 7, 9]  (* returns [7, 9] *)

Lists in F#
F# lists are based on ML, but with a few changes.

Elements in a list are separated by semicolons instead of commas.

Example:

[1; 3; 5; 7]

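The head and tail functions are still available, accessed through the List module as List.head and List.tail (early versions of F# spelled them List.hd and List.tl):

Example:

List.head [1; 3; 5; 7]  // returns 1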

F# also uses :: as the CONS operator to build new lists.

Lists in Python
Python has a built-in list data type that also works like arrays.

Python lists are mutable, which means:

You can change, update, or delete elements.

Creating a List in Python


Lists are created with square brackets and commas:

Example:

myList = [3, 5.8, "grape"]

Accessing a value:

x = myList[1] # assigns 5.8 to x

Python uses 0-based indexing, so the first item is at position 0.

Modifying a List
You can update a list item like this:

myList[1] = 6.5

To delete an item from a list:

del myList[1]

List Comprehensions in Python


Python has a powerful feature called list comprehension, which is a short
way to create new lists.

The idea derives from mathematical set notation and reached Python by way of the Haskell language.

Syntax:

[expression for variable in list if condition]

Example:

[x * x for x in range(12) if x % 3 == 0]

range(12) gives numbers from 0 to 11.

The condition x % 3 == 0 picks only numbers divisible by 3.

Then x*x squares each of those.

The final result:

[0, 9, 36, 81]
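It can help to see the comprehension next to the ordinary loop it replaces. The variable names squares and looped are illustrative.

```python
# The comprehension from the notes.
squares = [x * x for x in range(12) if x % 3 == 0]

# The same computation written as an explicit loop.
looped = []
for x in range(12):          # iterate over 0..11
    if x % 3 == 0:           # keep only multiples of 3
        looped.append(x * x) # square each one that passes the filter

print(squares)
print(squares == looped)
```

Reading the comprehension left to right gives the expression, then the loop, then the filter, while the loop version runs them in the order loop, filter, expression.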

List Comprehensions in Haskell


In Haskell, the format looks slightly different:

[n * n | n <- [1..10]]

This generates squares of numbers from 1 to 10.

List Comprehensions in F#
F# also supports list comprehensions, and they can be used to create
arrays too.

Example:

let myArray = [| for i in 1 .. 5 -> (i * i) |];;

This creates an array with square values:

[|1; 4; 9; 16; 25|]

Lists in Java and C#


C# and Java support lists through the library classes List (C#) and ArrayList (Java).

These are generic collection classes, which means:

The element type is a parameter, so a list can be declared for any object type,

But all elements of one list must be of that declared type (or a subtype of it).

6.10 Union Types

What Are Union Types?


A union is a data type that allows a variable to store different types of
values at different times.

For example, imagine a compiler’s constant table. It stores various
constant values (like numbers, decimals, true/false).

It would be helpful if one field in the table could hold any of these
types, instead of making separate fields.

In this case, we use a union to create a variable that can hold either an
int, a float, or a Boolean, one at a time.

So, we say this variable’s type is the union of these three types.

6.10.1 Design Issues
The main design challenge with union types is type checking.

Type checking means verifying if a value is of the expected type before
using it.

This issue is discussed more in Section 6.13.

6.10.2 Discriminated vs Free Unions


Languages like C and C++ have union constructs but do not check the
types when using them.

C/C++ Example:

union flexType {
    int intEl;
    float floatEl;
};

union flexType el1;

el1.intEl = 27;
float x = el1.floatEl; // No type checking happens here

Here, we store an integer in el1 , but later we read it as if it were a float.

The compiler doesn’t stop this, even though it’s incorrect — it just uses the
raw bits, which results in nonsense.

Unions like this are called free unions, because the programmer has
complete freedom and no safety (no checks are done).
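
The same effect can be reproduced from Python with the standard ctypes module, which can lay out a C-style union. This is only a sketch: the class name FlexType mirrors the C example, and it assumes c_int and c_float are both 4 bytes, as on mainstream platforms.

```python
import ctypes
import struct

class FlexType(ctypes.Union):
    # Both fields share the same storage, like the C union flexType.
    _fields_ = [("intEl", ctypes.c_int), ("floatEl", ctypes.c_float)]

el1 = FlexType()
el1.intEl = 27
x = el1.floatEl   # no check: the int's bits are reinterpreted as a float

# The same reinterpretation done by hand with struct, for comparison
# (assumes a 4-byte int and a 4-byte float).
expected = struct.unpack("=f", struct.pack("=i", 27))[0]

print(x == expected)  # True
print(x == 27.0)      # False
```

The value read back is a tiny number unrelated to 27, which is the "nonsense" the text refers to.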

Discriminated Unions
To make unions safer, some languages include a type indicator, called a
tag or discriminant.

A union with a tag is called a discriminated union.

The first language to introduce them was ALGOL 68.

Modern languages that support discriminated unions:

ML

Haskell

F#

6.10.3 Unions in F#
In F#, you create a union using the type keyword and | (OR) symbol to list
the possible types.

F# Example:

type intReal =
| IntValue of int
| RealValue of float

Here, intReal is the union type.

It can hold either an IntValue (an int) or a RealValue (a float).

You create values like this:

let ir1 = IntValue 17;;


let ir2 = RealValue 3.4;;

To read the value from a union, F# uses pattern matching with the match
keyword.

Example with Pattern Matching:

let a = 7;;
let b = "grape";;

let x = match (a, b) with
    | 4, "apple" -> "apple"
    | _, "grape" -> "grape"
    | _ -> "fruit";;

match compares (a, b) to each pattern:

If a=4 and b = "apple" → result is "apple" .

If b = "grape" → result is "grape" .

Otherwise → result is "fruit" .

Printing the Union Type in F#:

let printType value =
    match value with
    | IntValue value -> printfn "It is an integer"
    | RealValue value -> printfn "It is a float";;

If we call:

printType ir1;; // Output: It is an integer


printType ir2;; // Output: It is a float
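
For comparison, a discriminated union can be approximated in Python, where the class of each object plays the role of the tag. This is a rough sketch rather than a translation of the F# type: the names IntValue, RealValue, and IntReal are borrowed from the example above, and describe is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class IntValue:
    value: int

@dataclass
class RealValue:
    value: float

# The union type: a value is either an IntValue or a RealValue.
IntReal = Union[IntValue, RealValue]

def describe(v: IntReal) -> str:
    # The class of the object acts as the tag (discriminant).
    if isinstance(v, IntValue):
        return "It is an integer"
    if isinstance(v, RealValue):
        return "It is a float"
    raise TypeError("not an IntReal value")

print(describe(IntValue(17)))    # It is an integer
print(describe(RealValue(3.4)))  # It is a float
```

Unlike F#, Python performs this check at run time, so a missing case is not reported before the program runs.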

6.10.4 Evaluation
Free unions (like in C/C++) are unsafe, because:

The system can’t tell which type the value actually is,

There’s no type checking, so errors can occur silently.

This is one reason C and C++ are not strongly typed languages.

In contrast, languages like ML, Haskell, and F# have safe union types,
using tags and pattern matching.

Java and C# do not support unions at all, likely because:

These languages focus on safety,

Unions are risky unless carefully managed.

6.10.5 Implementation of Union Types
When a union is stored in memory:

It uses the same memory location for all possible types.

Only one type can be stored at a time.

The memory size given to the union is large enough to hold the biggest
type that could be stored in it.
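
The layout described above can be observed with Python's ctypes module. In this sketch the union Variant and its field names are made up for illustration; the only facts relied on are that every field of a union starts at offset 0 and that the union is at least as large as its largest field.

```python
import ctypes

class Variant(ctypes.Union):
    _fields_ = [("as_int", ctypes.c_int),
                ("as_double", ctypes.c_double),
                ("as_bool", ctypes.c_bool)]

member_sizes = [ctypes.sizeof(t)
                for t in (ctypes.c_int, ctypes.c_double, ctypes.c_bool)]

# Every variant starts at the same address (offset 0 inside the union).
offsets = [Variant.as_int.offset, Variant.as_double.offset, Variant.as_bool.offset]

print(offsets)                                      # [0, 0, 0]
print(ctypes.sizeof(Variant) >= max(member_sizes))  # True
```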

6.11 Pointer and Reference Types

What Are Pointers?


A pointer is a type of variable that stores a memory address instead of an
actual value.

The value stored in a pointer is usually the address of another variable, or
it can be nil (meaning the pointer is not pointing to anything valid right
now).

Why Are Pointers Useful?


Pointers are used in two main ways:
a. For indirect addressing — accessing memory through another variable.
b. For managing dynamic memory — memory that is allocated while the
program is running (from a special memory area called the heap).

When variables are created from the heap at runtime, they are called heap-
dynamic variables.

These variables often don’t have names.

They are also called anonymous variables.

You can only access them using pointers or references.

Pointers are not structured types like arrays or records.

Pointers are different from normal variables (called value types) because:

Pointers store addresses.

Value types store actual data.

Languages that don’t support pointers must simulate dynamic memory
management (e.g., for a binary tree) in complicated ways.

6.11.1 Design Issues with Pointers


When designing a language that uses pointers, some important questions
are:

What is the scope (visibility) and lifetime of a pointer?

What is the lifetime of the memory a pointer points to?

Can pointers point to values of any type, or only to specific types?

Should pointers be used for indirect access, dynamic memory, or
both?

Should the language have only pointer types, only reference types, or
both?

6.11.2 Pointer Operations


There are two key operations with pointers:

1. Assignment — setting the pointer to a memory address.

2. Dereferencing — accessing the value stored at the memory address
the pointer holds.

Example in C++:

int *ptr;
int x = 10;
ptr = &x; // Assignment: ptr now holds the address of x
int y = *ptr; // Dereferencing: y gets the value stored at that address (10)

*ptr gives the value that ptr is pointing to.

&x gives the address of variable x .

Dereferencing can be:

Explicit (you write *ptr in C/C++), or

Implicit (happens automatically in some languages).

You can also use pointer arithmetic in C/C++:

If ptr points to the first element of array, this comparison is true:

*(ptr + 1) == array[1]

This allows you to move through memory using math.
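
Python normally hides addresses, but its ctypes module exposes C-style pointers, which makes it possible to sketch assignment, dereferencing, and indexing through a pointer. The names array and ptr below are illustrative; ptr[1] plays the role of *(ptr + 1).

```python
import ctypes

# Roughly: int array[3] = {10, 20, 30}; int *ptr = array;
array = (ctypes.c_int * 3)(10, 20, 30)
ptr = ctypes.cast(array, ctypes.POINTER(ctypes.c_int))

first = ptr.contents.value   # like *ptr
second = ptr[1]              # like *(ptr + 1)

ptr[1] = 99                  # write through the pointer

print(first, second)  # 10 20
print(array[1])       # 99
```

The write through ptr is visible in array because both refer to the same memory.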

6.11.3 Problems with Pointers


Pointers were first introduced in the PL/I language.

While useful, they cause serious problems if not handled carefully.

6.11.3.1 Dangling Pointers


A dangling pointer happens when a pointer still holds the address of a
variable that has already been deleted (freed).

Example in C++:

int *p = new int[100];
int *q = p;
delete[] p; // memory is now deallocated
// q is now a dangling pointer

Accessing or modifying memory through a dangling pointer can crash the
program or cause incorrect results.

6.11.4 Heap Management


When variables are created at runtime (heap-dynamic), memory is taken
from the heap.

Languages must include tools to:

Allocate memory (e.g., malloc() in C, new in Java/C++),

Deallocate memory (e.g., free() in C, delete in C++).

If memory is not freed properly, it results in memory leaks (wasted
memory).

6.11.5 Reference Types


A reference is similar to a pointer but works differently in key ways:

A pointer refers to a memory address.

A reference refers directly to an object or value.

C++ Reference Example:

int x = 0;
int &ref = x; // ref is a reference to x
ref = 100; // sets x to 100

x and ref are now aliases — both refer to the same memory location.

In C++, references:

Must be initialized when declared.

Cannot be changed to refer to something else later.

Are always implicitly dereferenced (you don’t use * to get the value).

Function Parameters with References:

Using references in function parameters allows two-way
communication.

Changes made to the reference in the function affect the original
variable.

6.11.6 References in Java, C#, Python, Ruby


In Java:

References are only used for objects (like instances of classes).

You can reassign a reference to point to another object.

There are no pointers in Java.

Memory is managed automatically: you don’t free memory manually,
so there are no dangling references.

In C#:

Supports both pointers (like C++) and references (like Java).

But pointers are discouraged and require special marking (the unsafe
keyword).

Most C# code uses references for safety and simplicity.

In Python, Ruby, and Smalltalk:

All variables are references.

You cannot access memory addresses directly.

Variables are always implicitly dereferenced: you never write *ptr .
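
A short sketch of what "all variables are references" means in Python. The function add_item and the variable names are hypothetical; the point is that assignment copies the reference, not the object.

```python
def add_item(items):
    # Mutates the object that the caller's variable also refers to.
    items.append("grape")

a = [3, 5.8]
b = a            # b is a second reference to the same list object
b[0] = 100       # visible through a as well
add_item(a)      # two-way communication through the parameter

print(a)         # [100, 5.8, 'grape']
print(a is b)    # True

b = ["other"]    # rebinding b makes it refer elsewhere; a is unchanged
print(a)         # [100, 5.8, 'grape']
```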

6.11.7 Implementation of Pointer and Reference Types

6.11.7.1 How They Are Stored


On most modern computers, pointers and references are stored as single
values (memory addresses).

On older systems (like early Intel CPUs), an address had two parts:

A segment,

And an offset.

So, the pointer was stored as two 16-bit values.

6.11.7.2 Avoiding Dangling Pointers


Two techniques to avoid dangling pointers:

a. Tombstones

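Every heap-dynamic variable gets an extra cell (a tombstone) that points to
it, and pointers point to the tombstone instead of to the variable itself.

When the memory is freed, the tombstone stays behind and is set to nil.

Any pointer still pointing to the tombstone is then detected as invalid when
it is dereferenced.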

Downside: slower performance and more memory usage.

b. Locks and Keys


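Used in the implementation of UW-Pascal.

The pointer includes a key, and the heap-dynamic variable has a lock; both
get the same value when the variable is allocated.

When memory is deallocated, the lock is changed to an illegal value, so the
lock and key don’t match anymore.

If they mismatch on a dereference, the system knows the pointer is no longer valid.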
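
Both techniques can be simulated in a few lines. The following is only a toy model under stated assumptions: Python has no manual deallocation, so "freeing" is modeled by clearing a tombstone or overwriting a lock, and every class and function name here is invented for illustration.

```python
class DanglingPointerError(Exception):
    pass

# --- Tombstones: pointers refer to the tombstone, never to the cell ---
class Tombstone:
    def __init__(self, cell):
        self.cell = cell              # refers to the heap-dynamic variable

class TombstonePointer:
    def __init__(self, tombstone):
        self.tombstone = tombstone

    def deref(self):
        if self.tombstone.cell is None:
            raise DanglingPointerError("tombstone is nil")
        return self.tombstone.cell[0]

def free_via_tombstone(ptr):
    ptr.tombstone.cell = None         # the tombstone itself stays behind

# --- Locks and keys: pointer = (key, address), cell carries a lock ---
class LockedCell:
    def __init__(self, value, lock):
        self.value, self.lock = value, lock

class KeyPointer:
    def __init__(self, cell, key):
        self.cell, self.key = cell, key

    def deref(self):
        if self.key != self.cell.lock:
            raise DanglingPointerError("key does not match lock")
        return self.cell.value

def free_via_lock(ptr):
    ptr.cell.lock = 0                 # 0 is assumed never to be used as a key

# Tombstone demo: q is a copy of p and dangles after the free.
p = TombstonePointer(Tombstone([42]))
q = TombstonePointer(p.tombstone)
before_t = q.deref()
free_via_tombstone(p)

# Lock-and-key demo: s is a copy of r and dangles after the free.
cell = LockedCell(7, lock=1234)
r = KeyPointer(cell, key=1234)
s = KeyPointer(cell, key=1234)
before_k = s.deref()
free_via_lock(r)

for name, ptr in (("q", q), ("s", s)):
    try:
        ptr.deref()
    except DanglingPointerError as err:
        print(name, "is dangling:", err)
```

In both cases the stale copy (q or s) is caught on dereference instead of silently reading freed memory, at the cost of an extra check and extra storage per variable.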

6.12 Optional Types

What Are Optional Types?


In programming, sometimes you need a variable to have no value — for
example, when it’s not yet known or hasn’t been set.

Older programming languages sometimes used zero to mean “no value,”
but this caused problems:

You couldn’t tell if zero was a real value or just a placeholder for
“nothing.”

To solve this, newer languages offer optional types.

These are special types of variables that can either:

Hold a normal value, or

Hold a special “no value” indicator.

These are commonly used in:

C#

F#

Swift

Optional Types in C#
In C#, variables are divided into two categories:

1. Reference types (like classes)

2. Value types (like int , float , etc.)

Reference Types
Reference types are optional by default.

If they have no value, they are set to null .

Value Types
Value types are not optional by default.

But you can make them optional by adding a question mark ( ? ) after the
type.

Example:

int? x; // x is now an optional integer

Now, x can either:

Store a number like 5 , or

Store nothing, represented by null .

Checking if an Optional Variable Has a Value

if (x == null)
    Console.WriteLine("x has no value");
else
    Console.WriteLine("The value of x is: {0}", x);

This checks whether x holds a real number or is still null .

Optional Types in Swift
Swift has similar behavior but uses the keyword nil instead of null .

Declaring Optional in Swift:

var x: Int? // x is an optional integer

x can now either:

Store a number like 42 , or

Be nil (no value).

Checking if It Has a Value in Swift:

if x == nil {
    print("x has no value")
} else {
    print("The value of x is: \(x!)")
}

The ! is used to force access to the value if you're sure it's not nil .

Optional types help make programs more flexible and safe, by clearly
distinguishing between variables that have values and those that don’t yet.
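
The same idea can be sketched in Python, where None is the "no value" indicator and typing.Optional documents that a variable may hold it. The function describe is hypothetical; note how 0 stays a real value, which is exactly the ambiguity optional types remove.

```python
from typing import Optional

def describe(x: Optional[int]) -> str:
    if x is None:
        return "x has no value"
    return f"The value of x is: {x}"

print(describe(None))  # x has no value
print(describe(0))     # The value of x is: 0
print(describe(42))    # The value of x is: 42
```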

6.13 Type Checking

What Is Type Checking?


In this topic, the idea of operands and operators is expanded:

A subprogram (function) is treated like an operator.

The parameters you pass to it are considered operands.

An assignment statement (like x = 10 ) is also seen as a binary operator,
where:

The left-hand side (LHS) (the variable like x ) is one operand,

And the right-hand side (RHS) (like 10 ) is the other operand.

Type checking is the process where the system makes sure the operands
used with an operator are of compatible types.

What is a Compatible Type?


A compatible type is one that is either:
a. Allowed directly by the rules of the language for that operation, or
b. Can be automatically converted into a valid type using coercion.

What is Coercion?
Coercion is when the compiler or interpreter automatically changes the
type of a value so it fits correctly.

Example in Java:

int a = 5;
float b = 3.2f;
float c = a + b; // 'a' is coerced to float

Here, a is changed to float automatically so that both values are of the
same type before the addition.
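
Python applies the same kind of coercion in mixed int/float arithmetic, which gives a quick way to observe it. The variable names simply follow the Java example.

```python
a = 5        # int
b = 3.2      # float
c = a + b    # a is coerced to float before the addition

print(type(a).__name__)  # int
print(type(c).__name__)  # float
```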

What is a Type Error?


A type error occurs when an operator is used with a value (operand) of an
inappropriate or illegal type.

Example:

In early versions of C, if a function expected a float but was passed an
int, the program would continue but produce incorrect results,
because the compiler did not check whether the types matched.

Static vs Dynamic Type Checking


✅ Static Type Checking
If all variable-type connections (called bindings) in a language are static
(i.e., known at compile time), then most type checking can be done before
the program runs.

This is called static type checking.

🔁 Dynamic Type Checking


If the types of variables are decided while the program runs, it’s called
dynamic type binding.

In this case, type checking must be done at runtime.

This is called dynamic type checking.

Examples of Languages:
Static type checking is used in:

Java

C++

C#

Ada

Dynamic type checking is used in:

JavaScript

PHP
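
Python is another language with dynamic type checking, and a short sketch shows what that means in practice. The function add is hypothetical; the operand types are checked only when the + actually executes.

```python
def add(x, y):
    return x + y   # operand types are checked only when this line runs

print(add(2, 3))        # 5
print(add("ab", "cd"))  # abcd

try:
    add(2, "three")     # int + str is not allowed
except TypeError as err:
    print("type error detected at run time:", err)
```

The faulty call is accepted when the program is loaded and is rejected only when execution reaches it.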

Which Is Better?
It is better to detect errors at compile time than at runtime because:

Errors caught earlier are easier and cheaper to fix.

But, static type checking reduces programmer freedom:

You can’t use as many shortcuts.

Tricks and hacks that bypass type safety are not allowed.

Still, these limitations are considered good because shortcuts can:

Lead to bugs,

Make code harder to read and maintain.

When Type Checking Gets Hard


Type checking becomes difficult when a language allows a single memory
cell (i.e., variable) to store different types at different times.

This kind of behavior is seen with:

Union types in C and C++

Discriminated unions in ML, Haskell, F#

In such cases, even if a language does static type checking for most
variables:

Some type errors can still occur during execution.

Because the variable’s type changes dynamically, it must be
checked at runtime.

Summary of Key Points:


Type checking ensures that variables and functions are used with the right
types.

Static type checking = done before running the program (safer and usually
preferred).

Dynamic type checking = done while the program runs (more flexible, but
riskier).

Coercion helps match types automatically but can hide errors.

Languages like C++, even though they use static typing, still can’t catch all
errors at compile time due to union types.

6.14 Strong Typing

What Is Strong Typing?


In the 1970s, a major improvement in programming languages was the idea
of strong typing.

Strong typing is now widely accepted as a very important feature in a
programming language.

But, unfortunately, the term “strong typing” is sometimes used without a
clear definition; it can mean different things in different contexts.

Simple Definition
A programming language is strongly typed if type errors are always
caught (either at compile time or run time).

For this to work, the system must be able to:

Always know the type of every operand, and

Check that types are being used correctly.

Why Is Strong Typing Important?


It helps the compiler or interpreter detect mistakes where variables are
used incorrectly.

For example:

If a program tries to add a number and a string, strong typing would
stop the program and report an error.

In languages that are strongly typed:

Even if a variable can hold different types at different times (like a
union),

The system must check the actual type before using the variable.

Languages That Are or Aren’t Strongly Typed

❌ Not Strongly Typed: C and C++


These languages allow union types — a single variable can hold values of
multiple types at different times.

But there is no built-in type checking for these unions.

So, it's possible to access the wrong type by mistake, and the system
won't stop you.

✅ Strongly Typed: ML and F#


These languages are strongly typed.

Even if the type of a function parameter is not known at compile time, the
language ensures that type errors cannot occur.

⚠️ Almost Strongly Typed: Java and C#


Java and C# are based on C++, but they improved type safety.

They are nearly strongly typed, but not 100%.

Why? Because:

They allow explicit casting: the programmer can manually change
one type into another.

If this casting is wrong, it can still cause a type error during runtime.

How Coercion Affects Strong Typing

What is Coercion?
Coercion is the automatic conversion of one type into another when
needed.

Example in Java:

int a = 5;
float b = 3.14f;
float result = a + b; // a is automatically converted to float

While coercion makes coding easier, it also reduces the power of strong
typing.

Why Is That a Problem?


Suppose a and b are int variables, d and result are float variables, and you
meant to write:

result = a + b;

But by mistake, you wrote:

result = a + d; // d is a float, but you meant to use the int b

Java will not give an error.

Instead, it will coerce a into a float, and continue as if everything is fine,
even though it was a bug!

So, even in a language that seems strongly typed (like Java), coercion can
make it less reliable.

Comparison of Language Coercion Levels

Language | Coercion Level | Strong Typing? | Reliability
C, C++ | Lots of coercion | ❌ No | ❌ Less reliable
Java, C# | Moderate coercion | ⚠️ Almost | ⚠️ Better than C++
ML, F# | No coercion | ✅ Yes | ✅ Very reliable

The more coercion a language allows, the less reliable it becomes in terms
of catching errors.

Languages like ML and F# are more trustworthy, because:

They don’t silently convert types, and

They always check that the type is correct.

Summary of Strong Typing


Strong typing = catching all type errors, either at compile time or at run
time.

Coercion can weaken strong typing by letting incorrect types slip through.

Java and C# try to balance convenience with safety.

C and C++ are flexible but unsafe due to unions and unchecked coercion.

6.15 Type Equivalence

What Is Type Equivalence?


Earlier, we learned about type compatibility while discussing type
checking.

Type compatibility rules tell us:

What types of values (operands) can be used with each operator.

What kind of type errors are possible in a programming language.

These rules are called compatibility rules because:

Sometimes an operand can be automatically converted to the correct
type.

This automatic conversion is known as coercion.

But for structured types (like arrays, records, and user-defined types),
coercion is rare or not allowed.

So instead of compatibility, we focus on type equivalence.

What Is Type Equivalence Exactly?


Two types are said to be equivalent if a value of one type can be used in
place of a value of another type without coercion.

Type equivalence is a stricter version of compatibility: it means they are
directly usable without any automatic conversion.

Why It Matters:
The rules of type equivalence affect:

How data types are designed,

What operations are allowed on values of those types.

One important result of two types being equivalent is:

You can assign a value from one variable to another, as if they are the
same type.

Two Main Approaches to Type Equivalence
There are two ways to define whether two types are equivalent:

1. Name Type Equivalence


Two variables have equivalent types if:

They are declared in the same declaration, or

They are declared in declarations that use the same type name.

Simple and easy to implement, but also more restrictive.

❗ Example in Ada:
type Indextype is range 1..100;
count : Integer;
index : Indextype;

Here, even though both count and index hold integers, their types are
not considered equivalent, because the types have different names.

Problems with Name Type Equivalence:


If a complex user-defined type (like a record) is passed around in a
program, it must be declared globally.

Local declarations with the same structure are not accepted — the
compiler checks names, not structure.

Also, name type equivalence only works if all types have names.

But many languages allow anonymous types (types without names).

So, for name type equivalence to work:

The compiler must give internal names to anonymous types.

2. Structure Type Equivalence


Two types are considered equivalent if their structure (layout) is the same,
even if they have different names.

✅ Pros:
More flexible.

Allows the use of values across different types as long as they look the
same internally.

❌ Cons:
Harder to implement.

The compiler must compare the entire structure of two types — not just
names.

❓ What makes it complicated:


Self-referencing structures (e.g., linked lists).

Different field names in records:

Are two records equal if the fields are the same type but have different
names?

Arrays with different subscript ranges:

Is an array 0..10 the same as one with 1..11 if both have the same number
of elements?

Enumeration types:

Are they the same if they have the same number of elements, but the
element names are different?

When Structure Equivalence Goes Wrong


It might allow wrong assumptions — for example:

type Celsius = Float;
type Fahrenheit = Float;

These two types have the same structure, but mean different things.

Structure type equivalence would treat them as the same — which can
cause logical errors in a program.
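
The difference between the two approaches can be sketched with a toy type description. Everything here is an assumption made for illustration: NamedType, its fields, and the two comparison functions are not part of any real compiler, and this simple structural comparison treats field names as part of the structure and does not handle self-referencing types.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class NamedType:
    name: str
    structure: object   # e.g. "float", or ("record", (("x", "int"), ...))

def name_equivalent(t1: NamedType, t2: NamedType) -> bool:
    # Name type equivalence: only the type names are compared.
    return t1.name == t2.name

def structure_equivalent(t1: NamedType, t2: NamedType) -> bool:
    # Structure type equivalence: only the layouts are compared.
    return t1.structure == t2.structure

celsius = NamedType("Celsius", "float")
fahrenheit = NamedType("Fahrenheit", "float")
point_a = NamedType("PointA", ("record", (("x", "int"), ("y", "int"))))
point_b = NamedType("PointB", ("record", (("x", "int"), ("y", "int"))))

print(name_equivalent(celsius, fahrenheit))       # False
print(structure_equivalent(celsius, fahrenheit))  # True
print(name_equivalent(point_a, point_b))          # False
print(structure_equivalent(point_a, point_b))     # True
```

Under the structural rule Celsius and Fahrenheit are interchangeable, which is the logical error described above; under the name rule they are kept apart.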

How Ada Handles This
Ada uses a restrictive form of name type equivalence but also provides
subtypes and derived types to solve problems.

Derived Types in Ada


A derived type is a new type based on an existing type.

It inherits all its operations, but it is not equivalent to the original.

❗ Example:
type Celsius is new Float;
type Fahrenheit is new Float;

Even though both are based on Float , they are not equivalent.

This avoids mixing temperature types accidentally.

A literal value like 3.0 is treated specially — it has a universal real type and
is considered equivalent to any floating-point type.

Subtypes in Ada
A subtype is a type that has restrictions (like a smaller valid range), but is
still equivalent to the original type.

✅ Example:
subtype Small_type is Integer range 0..99;

Small_type is equivalent to Integer .

Difference Between Derived Type and Subtype in Ada


Feature | Subtype | Derived Type
Based on | Existing type | Existing type
Type Equivalence | ✅ Yes | ❌ No
Operation Access | ✅ Inherits all operations | ✅ Inherits all operations
Use with Parent | ✅ Can be used as parent type | ❌ Cannot be mixed with parent

Arrays in Ada: Special Case
Unconstrained Array Types

type Vector is array (Integer range <>) of Integer;
Vector_1 : Vector(1..10);
Vector_2 : Vector(11..20);

These two arrays are considered equivalent under structure type
equivalence:

Same element type ( Integer )

Same number of elements (10)

Constrained Anonymous Types

A : array (1..10) of Integer;
B : array (1..10) of Integer;

Even though they look the same, A and B are not equivalent.

They are both anonymous types — the compiler assigns separate internal
types to them.

❗ Multiple Declarations Don't Help:


C, D : array (1..10) of Integer;

C and D are also treated as having different types.

But if declared like this:

type List_10 is array (1..10) of Integer;
C, D : List_10;

Now C and D share the same type, because they use a named type.

How C and C++ Handle This


C uses both name and structure type equivalence.

For structs, enums, and unions, name type equivalence is used:

Each declaration creates a new, unique type.

For other types (like arrays), structure type equivalence is used.

C’s typedef does not create a new type — it just gives a new name to an old
type.

Exception: If two structs or unions are defined in different files, C allows
them to be treated as structurally equivalent.

C++ is more strict — no such exception exists.

Other Languages
Fortran and COBOL:

Don’t let users define and name their own types.

So, name type equivalence is not possible.

Java and C++ (OOP Languages):

Bring another layer of complexity: object compatibility and
inheritance-based typing.

This is covered in Chapter 12.

Final Thoughts
Type equivalence helps determine:

Whether two variables can interact (like in assignments or function
calls).

Some languages use strict naming rules, others focus on structure.

Choosing one over the other affects:

Safety

Flexibility

Ease of implementation

6.16 Theory and Data Types

What Is Type Theory?


Type theory is a field of study found in:

Mathematics

Logic

Computer Science

Philosophy

It started in mathematics in the early 1900s and later became a standard
part of logic.

Talking about type theory in general can be:

Complicated

Lengthy

Highly abstract (hard to understand without deep math background)

In Computer Science
Type theory is used in two main ways:

1. Practical Type Theory:

Deals with real programming languages

Focuses on how data types work in programs

2. Abstract Type Theory:

Based on advanced math

Involves things like:

Typed lambda calculus

Combinators

Existential types

Higher-order polymorphism

These abstract topics are beyond the scope of this textbook.

What Is a Data Type?


A data type is:

A set of values (like 1, 2, 3 for int )

A collection of operations you can perform on those values (like + , * ,
etc.)

What Is a Type System?


A type system includes:

The set of data types available in a language

The rules for how these types can be used

Every programming language that uses types has its own type system.

Formal Type Systems


A formal model of a type system includes:

A set of types

A collection of functions that tell the compiler how to:

Check types

Figure out the type of any expression in the code

One formal system used in type checking is:

Attribute grammars (covered in Chapter 3)

Another model uses:

A type map

A set of type-checking functions

What Is a Type Map?


A type map is like a dictionary:

Each entry contains:

A variable name

The type of that variable

Example:

x : int
y : float
z : bool

In statically typed languages, the type map is built during compilation.

In dynamically typed languages, it has to be kept during program execution.

How Type Maps Are Used


In compilers:

The symbol table stores the type map.

It is built by parts of the compiler called:

Lexical analyzer

Syntax analyzer

In dynamically typed languages:

Tags are used to store type information along with values.
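
A type map and one type-checking function can be sketched with a dictionary. This is a toy model, not how any particular compiler works: the tuple encoding of expressions, the function type_of, and the single rule for + (int and float only, with int coerced to float) are all assumptions chosen to keep the example short.

```python
# The type map from the example above: variable name -> type.
type_map = {"x": "int", "y": "float", "z": "bool"}

def type_of(expr):
    """Return the type of a variable name or of an expression ("+", left, right)."""
    if isinstance(expr, str):
        return type_map[expr]            # look the variable up in the type map
    op, left, right = expr
    lt, rt = type_of(left), type_of(right)
    if op == "+" and {lt, rt} <= {"int", "float"}:
        # Mixed int/float addition: the int operand is coerced to float.
        return "float" if "float" in (lt, rt) else "int"
    raise TypeError(f"cannot apply {op} to {lt} and {rt}")

print(type_of(("+", "x", "x")))  # int
print(type_of(("+", "x", "y")))  # float

try:
    type_of(("+", "z", "x"))     # bool + int is rejected by this rule set
except TypeError as err:
    print("type error:", err)
```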

Mathematical Background
A data type can be seen as a set.

For example: an enumeration type {Red, Green, Blue} is a set of colors.

This helps model data types using set operations like:

Union

Cartesian product

Subsets

Set Operations and Type Constructors


A type constructor is like a formula to build new types from existing ones.

These constructors often resemble set operations.

Finite Mapping
A finite mapping is a function from a finite set of values (the domain) to values in another set (the range).

In programming:

Arrays and functions can be thought of as mappings.

Example:
In a simple array:

Index 3 maps to the 4th element of the array.

In associative arrays (like Python dictionaries):

A string key maps to a value using a hash function.
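
Both kinds of mapping can be seen in a couple of lines of Python. The data values below are made up for illustration.

```python
# Array as a finite mapping: the domain is the set of indices 0..3.
scores = [70, 85, 90, 65]
print(scores[3])      # 65, the 4th element

# Associative array as a finite mapping: the domain is a finite set of keys.
ages = {"ann": 31, "bob": 27}
print(ages["bob"])    # 27
```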

Cartesian Product (Tuples)


The Cartesian product of two sets S1 and S2 is the set of all pairs (x, y)
where:

x is from S1

y is from S2

Example:

S1 = {1, 2}
S2 = {a, b}
S1 × S2 = {(1, a), (1, b), (2, a), (2, b)}

This models:

Tuples in Python, ML, Swift, and F#

Records or structs (with field names added)

Example of Cartesian Product in C:

struct intFloat {
    int myInt;
    float myFloat;
};

This struct represents the Cartesian product of int × float .
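
The Cartesian product example above can be reproduced with itertools.product, and a record with field names can be sketched with typing.NamedTuple. The class IntFloat mirrors the C struct; the letters a and b are written as strings because the original sets are abstract.

```python
from itertools import product
from typing import NamedTuple

S1 = [1, 2]
S2 = ["a", "b"]
pairs = list(product(S1, S2))    # every (x, y) with x from S1 and y from S2
print(pairs)  # [(1, 'a'), (1, 'b'), (2, 'a'), (2, 'b')]

class IntFloat(NamedTuple):
    # One element of int x float, with field names added.
    myInt: int
    myFloat: float

value = IntFloat(3, 2.5)
print(value.myInt, value.myFloat)  # 3 2.5
```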

Set Union
The union of two sets includes all elements from both sets.

This models union data types in programming (Section 6.10).

Subsets
A subset is a part of another set.

In Ada:

Subtypes are like subsets, but must include contiguous values (no
gaps).

Pointers and Set Theory


Pointers (like those in C) are not modeled using set operations.

They are defined with type operators (like * in C).

Conclusion
This section introduced some basic mathematical models behind data
types.

Understanding these models helps in:

Designing new types

Understanding how compilers work

Ensuring that type systems are sound and consistent
