
Chapter 4

Chapter 4 discusses type systems in programming, explaining how they assign types to program constructs and the rules governing type equivalence, compatibility, and inference. It outlines the differences between static and dynamic typing, as well as strong and weak typing, and categorizes types into scalar and composite types. The chapter also delves into type checking, conversion, and the significance of type systems in ensuring semantic correctness in programming languages.

Uploaded by

halisadam391

CHAPTER – 4

Type Systems
Introduction
• A type system is a logical system comprising a set of rules that assigns a property called a type to the various constructs of a computer program.
• A type tells the compiler or interpreter how the programmer intends to use the data.
• Computer hardware can interpret bits in memory in several different ways: as
o instructions,
o addresses,
o characters,
o integers, and
o floating-point numbers of various lengths.
• The bits themselves are untyped: nothing in the hardware records which interpretation is intended.
• In a dynamically typed language, types are associated with run-time values rather than with variables, so the programmer does not declare a variable’s type and a variable can hold a value of any data type. E.g., JavaScript (var point = 100).
• Assembly language is untyped.
Cont’d
A type system consists of:
1. A mechanism to define types and associate them with certain language
constructs:
• The constructs that must have types are precisely those that have values, or that can refer to objects that have values. These include
o named constants (const int x = 5),
o variables,
o record fields,
o parameters, and sometimes subroutines;
o literal constants (e.g., 17, 3.14, "foo"); and
o more complicated expressions containing these.
What has a type?
➢ Things that have values, for example:
double a = 1.0;
Person p = new Student("John");
Cont’d
A type system consists of:
2. A set of rules for type equivalence, type compatibility, and type inference.
• Type equivalence rules determine when the types of two values are the same.
➢ Structural equivalence: two types are the same if they consist of the same components.
• Type compatibility rules determine when a value of a given type A can be used in a given context that expects type B.
• Type inference rules define the type of an expression based on the types of its constituent parts or (sometimes) the surrounding context. For example:
a : int b : int
--------------------
a + b : int
Cont’d
Most programming languages include a notion of type for expressions and/or
objects.
Types serve several important purposes:
1. Types provide implicit context for many operations:
Examples
• “+” is concatenation for strings but summation for integers, etc.
• In C, a+b uses integer addition if a and b are of type int, and floating-point addition if a and b are of type double.
• In Pascal, new p, where p is a pointer, allocates a block of storage from the heap that is the right size to hold an object of the type pointed to by p.
➢ The programmer does not have to specify (or even know) this size.
• In C++, Java, and C#, new my_type()
➢ allocates (and returns a pointer to) a block of storage sized for an object of type my_type;
➢ it also automatically calls any user-defined initialization (constructor) function that has been associated with that type.
2. Types limit the set of operations that may be performed in a semantically valid program:
• They make sure that certain meaningless operations do not occur (type checking).
Examples of operations that type checking prevents:
• Adding a character and a record.
• Passing a file as a parameter to a subroutine that expects an integer.
Type checking
• Type checking is the process of ensuring that a program obeys the language’s type
compatibility rules.
• A violation of the rules is known as a type clash.
• Type checking cannot prevent all meaningless operations.
o It catches enough of them to be useful
• Strong typing: a buzzword, like structured programming
o The language prevents you from applying an operation to data on which it is not appropriate
o This was Pascal’s primary selling point in the 1980s
o Examples: Java, Ruby, Smalltalk, Python.
• Weak typing: in contrast, mismatched types trigger implicit conversions, which allows a program that contains type errors to run.
Type checking Cont’d
STATIC TYPING
• Compiler can do all the checking at compile time
• A Pascal implementation can also do most of its type checking at compile time
DYNAMIC TYPING - type checks wait until run time
• A form of late binding, found in languages that delay other issues until run time.
• Static typing is intended for performance; dynamic typing is intended for ease of programming.
• Programmers are freed from most concerns about data types, but extra storage is needed to keep type information during execution.
• Lisp, Smalltalk, and most scripting languages (e.g., Python and Ruby) are dynamically typed.
Type checking Cont’d
• Java is strongly typed, with a non-trivial mix of things that can be checked statically and things that have to be checked dynamically (for instance, for dynamic binding):
o String a = 1; //compile-time (static) error
o int i = 10.0; //compile-time (static) error
• Python is strongly and dynamically typed:
o a = 1;
o b = "2";
o a + b run-time error
• Perl is weakly and dynamically typed:
o $a = 1
o $b = "2"
o $a + $b no error / conversion
Summary:
➢ Strong-static: errors at compile time
➢ Strong-dynamic: errors at runtime
➢ Weak-dynamic: some potential errors at runtime, BUT approximations in many cases.
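The strong-dynamic behaviour described for Python can be observed directly. The following is a minimal sketch; the helper name try_add is hypothetical and exists only for this illustration.

```python
def try_add(a, b):
    """Attempt a + b and report what the run-time type check decides."""
    try:
        return a + b
    except TypeError as exc:
        # Python checks operand types when the operation executes,
        # not when the source is compiled.
        return f"TypeError: {exc}"


print(try_add(1, 2))          # both operands are int
print(try_add(1, "2"))        # int + str is rejected at run time
print(try_add(1, int("2")))   # an explicit conversion states the intent
```

No implicit conversion from "2" to 2 takes place, which is what separates this behaviour from the Perl case.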
Type checking Cont’d

C and C++ are considered weakly typed because type casts and implicit conversions allow values to be used at types other than their own.
The Meaning of “Type”
Three of the most popular notions of what is meant by type are:
1) Denotational Approach:
o A type is simply a set of values (domain), e.g., the byte domain is {0, 1, 2, ..., 255}.
o A value has a given type if it belongs to the set.
o Examples: the set of integers; enum hue {red, green, blue} in C.


2) Structural Approach/ constructive
o Pioneered by Algol W and Algol 68
o Type is either one of a small collection of built-in types (integer, character,
Boolean, real, etc.; also called primitive or predefined types), or
o A composite type created by applying a type constructor ( record , array , set , etc.)
to one or more simpler types.
The Meaning of “Type” Cont’d
Three of the most popular notions of what is meant by type are:
3) Abstraction-based Approach
o Pioneered by Simula-67 and Smalltalk, and is characteristic of modern
object-oriented languages.
o OO thinking
o Type is an interface consisting of a set of operations with well-defined and
mutually consistent semantics.
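The idea of a type as an interface can be sketched with Python's typing.Protocol. This is only an illustration under assumptions: the names Stack and ListStack and the choice of push/pop operations are hypothetical and do not come from the slides.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Stack(Protocol):
    """A type described purely by the operations it supports."""

    def push(self, item) -> None: ...

    def pop(self): ...


class ListStack:
    """One possible implementation; it never mentions Stack."""

    def __init__(self):
        self._items = []

    def push(self, item) -> None:
        self._items.append(item)

    def pop(self):
        return self._items.pop()


s = ListStack()
s.push(1)
s.push(2)
print(isinstance(s, Stack))   # checks only that push and pop exist
print(s.pop())
```

The isinstance check looks only at which operations are present; it cannot verify that their semantics are mutually consistent, which remains the implementer's responsibility.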
Classification of Types
1) Scalar types – one-dimensional, also sometimes called simple types.
• Discrete types (ordinal types) – countable/finite in implementation
such as: integer, Boolean, char, enumerations and subranges
• Rational
• Real
• complex
2) Composite (non-scalar) types: created by applying a type constructor to one
or more simpler types.
Such as: Records, arrays, sets, pointers, lists, files
Cont’d
Classification of Types
Booleans :
• Usually implemented as a single byte, with 1 representing true and 0 representing false.
• C historically omitted a Boolean type (it uses zero for false and anything else for true); C99 added _Bool.
• Icon replaces Booleans with a more general notion of success and failure.

Characters :
• Traditional ASCII encoding : 1 byte
• More recent languages (e.g., Java and C#) use a two-byte representation designed to accommodate (the commonly used portion of) the Unicode character set.
• Fortran 2003 supports four-byte Unicode characters.
Classification of Types
Cont’d

Numeric Types :
• A few languages (e.g., C and Fortran) distinguish between different lengths of integers (..., -2, -1, 0, 1, 2, ...) and real numbers (which can include a fractional component).
• Most do not, and leave the choice of precision to the implementation.
• Differences in precision across language implementations lead to a lack of portability.
• Java and C# provide several lengths of numeric types, with a specified precision for each.
Classification of Types Cont’d
Enumeration Types :
• Enumerations were introduced by Wirth in the design of Pascal.
• An enumeration type consists of a set of named elements.

• The values of an enumeration type are ordered, so comparisons are generally valid
( mon < tue )
• There is usually a mechanism to determine the predecessor or successor of an
enumeration value. (in Pascal, tomorrow := succ(today)).

enum weekday {sun, mon, tue, wed, thu, fri, sat};
In C, this declaration is essentially equivalent to:
typedef int weekday;
const weekday sun = 0, mon = 1, tue = 2, wed = 3, thu = 4, fri = 5, sat = 6;
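Ordering and successor on an enumeration can be sketched in Python with the standard enum module. The class name Weekday and the helper succ are assumptions made for this illustration; succ mimics Pascal's succ and is not a built-in.

```python
from enum import IntEnum


class Weekday(IntEnum):
    SUN = 0
    MON = 1
    TUE = 2
    WED = 3
    THU = 4
    FRI = 5
    SAT = 6


def succ(day: Weekday) -> Weekday:
    """Successor of a weekday; raises ValueError past the last value."""
    return Weekday(day + 1)


print(Weekday.MON < Weekday.TUE)   # enumeration values are ordered
print(succ(Weekday.MON).name)      # the value after MON
```

Because IntEnum members are also integers, this mirrors the C view of an enumeration as named integer constants.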
Classification of Types Cont’d
Subrange Types :
• First introduced in Pascal, and are found in many subsequent languages.
• A subrange is a type whose values compose a contiguous subset of the values of
some discrete base type (also called the parent type).
• 12..14 a subrange of integer type
[Example declarations of test_score and workday in Pascal and Ada]
The range... portion of the definition in Ada is called a type constraint.
In this example, test_score is a derived type, and workday is a constrained subtype.
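A subrange can be thought of as a base type plus a bounds check. The sketch below uses a hypothetical Subrange class, and the bounds 0..100 for test_score are assumed for illustration rather than taken from the slides.

```python
class Subrange:
    """A contiguous range low..high of integers, checked at run time."""

    def __init__(self, low: int, high: int):
        self.low = low
        self.high = high

    def check(self, value: int) -> int:
        """Return value unchanged if it lies in the range, else raise."""
        if not (self.low <= value <= self.high):
            raise ValueError(f"{value} not in {self.low}..{self.high}")
        return value


test_score = Subrange(0, 100)   # assumed bounds, for illustration only
print(test_score.check(87))
```

A compiled language would insert the equivalent of check wherever a value of the base type is assigned to a subrange variable and the bounds cannot be verified statically.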
Classification of Types Cont’d

Records (structs) :
• Introduced by Cobol.
• A record consists of a collection of fields, each of which belongs to a (potentially different) simpler type.
• A record type corresponds to the Cartesian product of the types of the fields.
Each field has its own type:
struct MyStruct {
    boolean ok;
    int bar;
};
MyStruct foo;
There is a way to access the field:
foo.bar; <- C, C++, Java style, F-logic path expressions
bar of foo <- Cobol/Algol style
Classification of Types Cont’d

Variant records (unions) :


• Provide two or more alternative fields or collections of fields, but only one of a variant record’s fields (or collections of fields) is valid at any given time.
• A variant record type is the union of its field types, rather than their Cartesian product.
• In the variant record below, a and b occupy the same memory. This saves space but is usually unsafe; adding a tag makes it safe again.
• Without a tag, the program cannot know which field is currently stored.
union {
int a;
float b;
}
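The shared storage and the role of the tag can be demonstrated with Python's ctypes module, which models C unions. The names IntOrFloat, Tagged, and read_float, and the one-character tag encoding, are assumptions made for this sketch.

```python
import ctypes


class IntOrFloat(ctypes.Union):
    # Both fields start at offset 0, so they occupy the same 4 bytes.
    _fields_ = [("a", ctypes.c_int32), ("b", ctypes.c_float)]


class Tagged(ctypes.Structure):
    # The tag records which field of the union is currently valid.
    _fields_ = [("tag", ctypes.c_char), ("u", IntOrFloat)]


def read_float(v: Tagged) -> float:
    """Read field b only if the tag says a float is stored."""
    if v.tag != b"f":
        raise TypeError("variant does not currently hold a float")
    return v.u.b


u = IntOrFloat()
u.b = 1.0
print(ctypes.sizeof(u))   # size of one field, not the sum of both
print(hex(u.a))           # the float's bits read back as an int

t = Tagged()
t.tag = b"f"
t.u.b = 2.5
print(read_float(t))
```

Reading u.a after writing u.b is exactly the unsafe access an untagged union permits; read_float shows how a tag check turns that into a detectable error.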
Classification of Types Cont’d
Arrays :
• Most commonly used composite types.
• Can be thought of as a function that maps members of an index type to members
of a component type.
• Arrays of characters are often referred to as strings.
Sets :
• Introduced by Pascal.
• A set type is the mathematical powerset of its base type.
• A variable of a set type contains a collection of distinct elements of the base type.
o The elements are distinct and unordered; e.g., Pascal supports sets of any discrete type.
Classification of Types Cont’d
Pointers :
• A pointer value is a reference to an object of the pointer’s base type.
• Pointers are often but not always implemented as addresses.
• They are most often used to implement recursive data types.
Lists :
• Like arrays, lists contain a sequence of elements, but there is no notion of mapping or indexing; elements are reached by following links.
• A list is defined recursively.
• To find a given element of a list, a program must examine all previous elements, recursively or iteratively, starting at the head.
• Lists are of variable length and, in some languages, can store elements of different types.
Type Equivalence
• When are the types of two values the same?
• In a language in which the user can define new types, there are two principal ways of
defining type equivalence:
o structural equivalence and
o name equivalence
• Name equivalence : is based on the lexical occurrence of type definitions: roughly
speaking, each definition introduces a new type.
• Structural equivalence: is based on the content of type definitions: roughly speaking, two types are the same if they consist of the same components, put together in the same way.
Type Equivalence Cont’d

• Name equivalence is more rigorous.
• If the programmer makes the effort to name two types differently, the programmer probably wants them to be treated as different, even if their structures are identical.
• Structural equivalence is used in Algol-68, Modula-3, and (with various
wrinkles) C and ML.
• Name equivalence appears in Java, C#, standard Pascal, and most Pascal
descendants, including Ada.
Type Equivalence Cont’d
Should the reversal of the order of the fields change the type? ML says no; most languages say yes.
In a similar vein, consider the following arrays, again in a Pascal-like notation:
type str = array [1..10] of char;
type str = array [0..9] of char;
Here the length of the array is the same in both cases, but the index values are different.
Should these be considered equivalent? Most languages say no, but some (including Fortran and Ada) consider them compatible.


Type Equivalence Cont’d
• Most programmers would probably want to be informed if they accidentally assigned a value of type school into a variable of type student,
• but a compiler whose type checking is based on structural equivalence will blithely accept such an assignment.
• In this example, variables x and y will be considered to have different types under name equivalence: x uses the type declared at line 1; y uses the type declared at line 4.
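The difference between the two equivalence rules can be sketched with two record types that share a structure. The field names (name, age) and the two helper functions are assumptions for this illustration; the slides' own declarations of student and school are not reproduced here.

```python
from dataclasses import dataclass, fields


@dataclass
class Student:
    name: str
    age: int


@dataclass
class School:
    name: str
    age: int


def name_equivalent(x, y) -> bool:
    """Same type only if both values come from the same definition."""
    return type(x) is type(y)


def structurally_equivalent(x, y) -> bool:
    """Same type if the fields match in name, type, and order."""
    def shape(obj):
        return [(f.name, f.type) for f in fields(obj)]
    return shape(x) == shape(y)


s = Student("Ada", 20)
c = School("Hilltop", 150)
print(name_equivalent(s, c))           # two distinct definitions
print(structurally_equivalent(s, c))   # identical components
```

Under the structural rule an accidental assignment between the two would go unnoticed, which is the hazard described above.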
Variants of Name Equivalence

• Here new_type is said to be an alias for old_type .


• Should we treat them as two names for the same type, or as names for two
different types that happen to have the same internal structure?
• The “right” approach may vary from one program to another.
Cont’d
Variants of Name Equivalence

• A language in which aliased types are considered distinct is said to have strict
name equivalence.
• A language in which aliased types are considered equivalent is said to have loose
name equivalence.
• Most Pascal-family languages use loose name equivalence
• Under strict name equivalence, a declaration type A=B is considered a definition.
• Under loose name equivalence it is merely a declaration; A shares the definition
of B.
Cont’d
Variants of Name Equivalence
Under strict name equivalence
• line 3 is both a declaration and a definition, and blink is a new type, distinct from alink.
• p and q have the same type, because they both use the type definition on the right-hand side of line 4.
Under loose name equivalence
• line 3 is just a declaration; it uses the definition at line 2.
• r, s, and u all have the same type.
Under structural equivalence, all six of the variables shown have the same type, namely pointer to whatever cell is.
Type Conversion and Casts
Converting one type to another (casting) is required when:
• The types are structurally equivalent, but the language uses name equivalence.
o The conversion is only conceptual, not physical: no code needs to be executed at run time.
• The types have different but intersecting sets of values (e.g., one is a subrange of the other).
o A run-time check tests the validity of the conversion.
• The types are physically different, but values of one type correspond to values of the other, e.g., all integers can be represented as reals.
Non-converting cast
• Treats a variable of one type as another type, without changing the physical representation.
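The cases above can be contrasted in a few lines. This is a sketch only: to_byte and float_bits are hypothetical helper names, and the 0..255 range is an assumed subrange.

```python
import struct

# Converting cast between physically different types: the bits change.
as_real = float(3)
truncated = int(3.7)


def to_byte(n: int) -> int:
    """Convert into the subrange 0..255 with a run-time validity check."""
    if not 0 <= n <= 255:
        raise ValueError(f"{n} is outside 0..255")
    return n


def float_bits(x: float) -> int:
    """Non-converting cast: reinterpret a 4-byte float as an unsigned int."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    return bits


print(as_real, truncated, to_byte(200), hex(float_bits(1.0)))
```

float(3) produces a new representation for the same value, whereas float_bits keeps the representation and changes only how it is interpreted.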
Type Compatibility
• Most languages do not require equivalence of types in every context.
• A value’s type must be compatible with that of the context in which it appears.
• In an assignment statement, the type of the right-hand side must be compatible
with that of the left-hand side.
• The types of the operands of + must both be compatible with some common type
that supports addition (integers, real numbers, or perhaps strings or sets).
• In a subroutine call, the types of any arguments passed into the subroutine must
be compatible with the types of the corresponding formal parameters
Type Compatibility Cont’d
• The definition of type compatibility varies greatly from language to language.
• Ada takes a relatively restrictive approach: a type S is compatible with a type T if and only if
➢ S and T are equivalent,
➢ one is a subtype of the other (or both are subtypes of the same base type), or
➢ both are arrays, with the same numbers and types of elements in each dimension.
• Pascal was only slightly more lenient:
➢ In addition to allowing the intermixing of base and subrange types, it allowed an integer to be used in a context where a real was expected.
Coercion
• Coercion is automatic, implicit type conversion.
• When an expression of one type is used in a context where a different type is expected, one normally gets a type error.
• Many languages nevertheless allow such uses, and coerce the expression to the proper type.
• Fortran has lots of coercion, all based on operand type.
• C has lots of coercion, too, but with simpler rules (as in traditional C):
➢ all floats in expressions become doubles
➢ short int and char become int in expressions
Coercion Cont’d
In effect, coercion rules are a relaxation of type checking.
• Languages such as Modula-2 and Ada do not permit coercions.
• C++, however, provides programmer-extensible coercion rules.
➢ They’re one of the hardest parts of the language to understand.
Type Inference
What determines the type of the overall expression?
• The result of an arithmetic operator usually has the same type as the operands (possibly after coercing one of them, if their types were not the same).
• The result of a comparison is usually Boolean.
• The result of a function call has the type declared in the function’s
header.
• The result of an assignment has the same type as the left-hand side.
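These rules can be written down as a small recursive function over expression trees. The following is a minimal sketch, not a real compiler pass: the tuple encoding of expressions, the type names "int", "real", and "bool", and the function name infer are all assumptions made for illustration.

```python
def infer(expr, env):
    """Infer the type name of a tiny expression tree.

    expr is an int/float/bool literal, a variable name (str),
    or a tuple (op, left, right).
    """
    if isinstance(expr, bool):        # test bool first: bool is a subclass of int
        return "bool"
    if isinstance(expr, int):
        return "int"
    if isinstance(expr, float):
        return "real"
    if isinstance(expr, str):
        return env[expr]              # declared type of a variable
    op, left, right = expr
    lt, rt = infer(left, env), infer(right, env)
    numeric = {"int", "real"}
    if op in {"+", "-", "*"}:
        if lt not in numeric or rt not in numeric:
            raise TypeError(f"cannot apply {op} to {lt} and {rt}")
        # Mixed operands: the int operand is coerced to real.
        return "int" if lt == rt == "int" else "real"
    if op in {"<", "=="}:
        return "bool"                 # comparisons yield Boolean
    if op == ":=":
        return lt                     # assignment takes the left-hand type
    raise ValueError(f"unknown operator {op}")


env = {"a": "int", "b": "int", "x": "real"}
print(infer(("+", "a", "b"), env))
print(infer(("+", "a", "x"), env))
print(infer(("<", "a", "x"), env))
```

The first call corresponds to the rule "a : int, b : int, therefore a + b : int"; the second shows the coercion case.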
Arrays
• An array occupies an area of memory holding elements of the same type.
• Arrays are the most common and important composite data types
• Unlike records, which group related fields of different types, arrays are usually
homogeneous
• Semantically, arrays can be thought of as a mapping from an index type to a
component or element type
• Usually the only operations permitted are selection of an element and assignment; however,
➢ Fortran 90 offers many array operations supporting matrix algebra
➢ Ada and Fortran 90 allow arrays to be compared for equality
Arrays Cont’d
Dimensions, Bounds, and Allocation
• global lifetime, static shape — If the shape of an array is known at compile
time, and if the array can exist throughout the execution of the program, then
the compiler can allocate space for the array in static global memory
• local lifetime, static shape — If the shape of the array is known at compile
time, but the array should not exist throughout the execution of the program,
then space can be allocated in the subroutine’s stack frame at run time.
• arbitrary lifetime, shape bound at elaboration time— In Java and C# an
array is a reference to an object, whose space is allocated on the heap.
Arrays Cont’d
Possible layouts of memory for contiguous elements:
➢ Row-major and column-major: two ways of storing multidimensional arrays in linear memory
Example: int A[2][3] = { {1, 2, 3}, {4, 5, 6} };
o Column-major: A is laid out contiguously in linear memory as: 1 4 2 5 3 6
offset = row + column * NUMROWS
Example: A[1][1] (value 5)
offset = 1 + 1 * 2 = 3
• used in Fortran, MATLAB, GNU Octave, R, Rasdaman, X10 and Scilab
Arrays Cont’d
o Row-major: A is laid out contiguously in linear memory as: 1 2 3 4 5 6
offset = row * NUMCOLS + column
Example: A[1][1] (value 5)
offset = 1 * 3 + 1 = 4
• used in C, PL/I, Python (NumPy)


Arrays Cont’d
Consider the array:
int A[3][4] = {{1, 2, 3, 4}, {5, 6, 7, 8}, {9, 10, 11, 12}};
➢ Row-major: 1 2 3 4 5 6 7 8 9 10 11 12
o the offset for the element A[1][3] (value 8):
o row*NUMCOLS + column = 1*4+3 = 7
➢ Column-major: 1 5 9 2 6 10 3 7 11 4 8 12
o the offset for the element A[1][3] (value 8):
o row + column*NUMROWS = 1+3*3 = 10
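The two offset formulas can be checked by flattening the same 3x4 matrix in both orders. This is a small sketch; the function names row_major_offset and column_major_offset are chosen for illustration.

```python
def row_major_offset(row, col, num_rows, num_cols):
    """Offset of A[row][col] when rows are stored one after another."""
    return row * num_cols + col


def column_major_offset(row, col, num_rows, num_cols):
    """Offset of A[row][col] when columns are stored one after another."""
    return row + col * num_rows


A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
num_rows, num_cols = len(A), len(A[0])

# Flatten the same matrix in both orders.
row_major = [A[r][c] for r in range(num_rows) for c in range(num_cols)]
col_major = [A[r][c] for c in range(num_cols) for r in range(num_rows)]

i = row_major_offset(1, 3, num_rows, num_cols)
j = column_major_offset(1, 3, num_rows, num_cols)
print(i, row_major[i])   # 7 8
print(j, col_major[j])   # 10 8
```

Both offsets reach the same element, 8; only its position in linear memory differs.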
Arrays
Row pointers:
• Allows rows to be put anywhere - nice for big arrays on
machines with segmentation problems
• Nice for matrices whose rows are of different lengths
• e.g. an array of strings
• Avoids multiplication for offset calculation
• Requires extra space for the pointers
Row pointers memory layout:
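A row-pointer layout can be imitated with a list of separately allocated rows. The sketch below is only an analogy for the memory layout, and the strings used as rows are made up for illustration.

```python
# Each entry of `rows` is a reference to a separately allocated row,
# so rows may differ in length and need not be adjacent in memory.
rows = [
    list("Sunday"),
    list("Monday"),
    list("Wednesday"),
]


def element(rows, r, c):
    """Two indirections: follow the row pointer, then index within the row."""
    return rows[r][c]


print(element(rows, 2, 3))
print([len(row) for row in rows])
```

No multiplication is needed to locate an element, at the cost of one extra reference per row.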
End of chapter - 4
