Chapter-6 Data Types

Uploaded by

hamnahussain494

Chapter 6

Data Types

ISBN 0-321-33025-0
Primitive Data Types
• Almost all programming languages provide a set of
primitive data types
• Primitive data types: Those not defined in terms of
other data types

Copyright © 2006 Addison-Wesley. All rights reserved. 2


Primitive Data Types: Integer
• The integer is the most common primitive numeric data type.
• Java includes four signed integer sizes: byte, short,
int, and long.
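
The four Java sizes can be related to their value ranges with a short sketch. It is written in Python for brevity and assumes two's-complement storage with widths of 8, 16, 32, and 64 bits; the helper name signed_range is made up for this illustration.

```python
# Two's-complement range of a signed integer stored in `bits` bits.
def signed_range(bits):
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1

# Widths (in bits) of Java's four signed integer types.
JAVA_INTEGER_BITS = {"byte": 8, "short": 16, "int": 32, "long": 64}

for name, bits in JAVA_INTEGER_BITS.items():
    low, high = signed_range(bits)
    print(f"{name:5} {bits:2} bits  {low} .. {high}")
```

Each extra bit doubles the number of representable values, which is why the ranges grow so quickly from byte to long.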

Primitive Data Types: Floating Point
• Languages for scientific use support at least two
floating-point types (e.g., float and double;
sometimes more)

Primitive Data Types: Boolean
• Simplest of all
• Range of values: two elements, one for “true” and
one for “false”
• Could be implemented as bits, but often as bytes.

Primitive Data Types: Character
• Stored as numeric codings
• Most commonly used coding: ASCII
• An alternative coding: Unicode (originally 16 bits per character)
– Includes characters from most natural languages
– Java was the first widely used language to adopt it
– C# and JavaScript also support Unicode
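
A small sketch of "stored as numeric codings", written in Python for brevity. The variable names are illustrative only; the encodings used ("ascii" and "utf-16-be") are standard Python codec names.

```python
# ord() gives the numeric code of a character; chr() is the inverse.
code_a = ord("A")    # 65, the same in ASCII and in Unicode
code_pi = ord("π")   # 960, outside the 7-bit ASCII range

# ASCII stores "A" in one byte.
ascii_bytes = "A".encode("ascii")

# A 16-bit Unicode encoding stores "π" in one 16-bit unit (two bytes).
utf16_bytes = "π".encode("utf-16-be")

print(code_a, code_pi, len(ascii_bytes), len(utf16_bytes))
```

Trying "π".encode("ascii") raises UnicodeEncodeError, which shows why a larger coding is needed for most natural languages.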

Character String Types Operations
• Values are sequences of characters
• Typical operations:
– Assignment and copying
– Comparison (=, >, etc.)
– Catenation
– Substring reference
– Pattern matching
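
The typical operations listed above can be sketched in a few lines. Python is used here only as a convenient notation; the variable names and sample strings are made up for the illustration.

```python
import re

name = "data types"
copy = name                           # assignment
is_equal = (copy == name)             # comparison for equality
comes_first = ("array" < "record")    # comparison by lexicographic order
joined = name + " chapter"            # catenation
piece = name[5:10]                    # substring reference
match = re.search(r"t\w+s", name)     # pattern matching

print(is_equal, comes_first, joined, piece, match.group())
```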

Character String Type in Certain Languages
• C and C++
– Not primitive
– Use char arrays and a library of functions that provide
operations

• SNOBOL4 (a string manipulation language)
– Primitive
– Many operations, including elaborate pattern matching

• Java
– Supported as a built-in type through the String class (a class, not one of Java’s primitive types)

Character String Length Options
• Several design choices for the length of a string:
1. Static:
– The length is fixed when the string is created, for example
Java’s String class

2. Limited dynamic length:
– The length can vary up to a declared, fixed maximum.
– In C-based languages, a special character marks the end of
a string’s characters, rather than the length being stored.

3. Dynamic (no maximum):
– The length can vary with no maximum.
– SNOBOL4, Perl, JavaScript
• Ada supports all three string length options.
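
The limited dynamic option as used in C-based languages can be mimicked with a fixed-size byte buffer. This is only a sketch in Python, assuming an 8-byte maximum and a zero byte as the terminator; c_strlen and store are hypothetical helpers, not library functions.

```python
# A fixed buffer of 8 bytes holding "cat" the way a C char array would:
# the characters, a terminating zero byte, then unused space.
buffer = bytearray(b"cat\x00\x00\x00\x00\x00")

def c_strlen(buf):
    """Count bytes up to (not including) the first zero byte."""
    length = 0
    while buf[length] != 0:
        length += 1
    return length

def store(buf, text):
    """Overwrite buf with text plus a terminator; reject text that does not fit."""
    data = text.encode("ascii")
    if len(data) + 1 > len(buf):
        raise ValueError("string exceeds the declared maximum length")
    buf[:len(data) + 1] = data + b"\x00"

print(c_strlen(buffer))   # length found by scanning, not stored anywhere
store(buffer, "horse")
print(c_strlen(buffer), len(buffer))
```

The buffer size never changes; only the position of the terminator does, which is why no run-time length field is needed in this scheme.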

Character String Implementation
• Static length: compile-time descriptor
• Limited dynamic length: may need a run-time
descriptor for length (but not in C and C++)
• Dynamic length: need run-time descriptor;
allocation/de-allocation is the biggest
implementation problem

Compile- and Run-Time Descriptors

[Figure: a compile-time descriptor for static strings (left) and a run-time descriptor for limited dynamic strings (right)]

Enumeration Types
• Provides a way of defining and grouping collections
of named constants, which are called enumeration
constants.
• C# example
enum days {Mon, Tue, Wed, Thu, Fri, Sat, Sun};
• The enumeration constants are implicitly
assigned the integer values, 0, 1, …
• Design issues
– Is an enumeration constant allowed to appear in more than one type
definition, and if so, how is the type of an occurrence of that constant
checked?
– Are enumeration values coerced to integer?
– Is any other type coerced to an enumeration type?

Evaluation of Enumerated Type
• Aid to readability, e.g., no need to code a color as a
number.
• Aid to reliability, e.g., compiler can check:
– Operations (don’t allow colors to be added)
– No enumeration variable can be assigned a value outside
its defined range.
• Ada, C#, and Java 5.0 provide better support for
enumeration than C++ because enumeration type variables
in these languages are not coerced into integer types.
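
The reliability checks above can be seen in a sketch using Python's standard enum module, whose Enum members (like those of Ada, C#, and Java) are not coerced to integers. The Days type mirrors the C# example; can_add and from_integer are hypothetical helper names.

```python
from enum import Enum

class Days(Enum):
    MON = 0
    TUE = 1
    WED = 2
    THU = 3
    FRI = 4
    SAT = 5
    SUN = 6

def can_add(a, b):
    """Report whether a + b is a permitted operation."""
    try:
        a + b
    except TypeError:
        return False
    return True

def from_integer(n):
    """Convert explicitly; values outside 0..6 raise ValueError."""
    return Days(n)

print(Days.WED.value)                 # the underlying integer is available on request
print(Days.MON == 0)                  # but a constant is not equal to its integer
print(can_add(Days.MON, Days.TUE))    # adding two days is rejected
```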

Subrange Types
• An ordered contiguous subsequence of an ordinal
type
– Example: 12..18 is a subrange of integer type

• Ada’s design
type Days is (mon, tue, wed, thu, fri, sat, sun);
subtype Weekdays is Days range mon..fri;
subtype Index is Integer range 1..100;

Day1: Days;
Day2: Weekdays;

Day2 := Day1; -- legal; raises Constraint_Error at run time if Day1 is sat or sun

Subrange Evaluation
• Aid to readability
– Makes it clear to readers that variables of a subrange type can
store only a certain range of values

• Reliability
– Assigning a value to a subrange variable that is outside the
specified range is detected as an error

Implementation of User-Defined Ordinal Types

• Enumeration types are implemented as integers
• Subrange types are implemented like the parent types
with code inserted (by the compiler) to restrict
assignments to subrange variables
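
The "code inserted by the compiler" amounts to a range check before each assignment. The following is a rough sketch in Python, not what any particular compiler emits; the Subrange class and its check method are hypothetical, and INDEX mirrors the Ada subtype Index (1..100).

```python
class Subrange:
    """A parent-type value restricted to the closed range low..high."""

    def __init__(self, low, high):
        self.low = low
        self.high = high

    def check(self, value):
        # The test a compiler would insert before storing into a subrange variable.
        if not (self.low <= value <= self.high):
            raise ValueError(f"{value} is outside {self.low}..{self.high}")
        return value

INDEX = Subrange(1, 100)

i = INDEX.check(42)   # accepted: the value is stored unchanged
print(i)
```

Calling INDEX.check(101) raises ValueError, the analogue of the run-time error the slide describes.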

Array Indexing
• Indexing (or subscripting) is a mapping from indices
to elements
array_name(index_value_list) → an element

• Index Syntax
– FORTRAN, PL/I, Ada use parentheses
– Most other languages use brackets

Arrays Index (Subscript) Types
• FORTRAN, C: integer only
• Pascal: integer, boolean, char, and enumeration
• Ada: integer, enumeration, boolean and char
• Java: integer only

Array Initialization
• Some languages allow initialization at the time of
storage allocation
– C, C++, and Java example (C# requires the form int[] list = {4, 5, 7, 83};)
int list [] = {4, 5, 7, 83};
– Character strings in C and C++
char name [] = "freddie";
– Arrays of strings in C and C++ (const is required by modern C++)
const char *names [] = {"Bob", "Jake", "Joe"};
– Java initialization of String objects
String[] names = {"Bob", "Jake", "Joe"};

Arrays Operations
• APL provides the most powerful array processing operations
for vectors and matrices, as well as unary operators.
• Example: Consider the APL code
+/A×B
– Computes (A[1]×B[1]) + (A[2]×B[2]) + …, the inner product of A and B
• Ada allows array assignment
• Fortran provides elemental operations
– For example, the + operator between two arrays results in an
array of the sums of the element pairs of the two arrays.
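
Both array operations can be spelled out element by element. This is a sketch in Python using plain lists; inner_product and elementwise_add are hypothetical names, and the length check is an assumption added for safety.

```python
def inner_product(a, b):
    """What the APL expression +/A×B computes: multiply pairwise, then sum."""
    if len(a) != len(b):
        raise ValueError("arrays must have the same length")
    return sum(x * y for x, y in zip(a, b))

def elementwise_add(a, b):
    """What Fortran's + does between two arrays: add corresponding elements."""
    if len(a) != len(b):
        raise ValueError("arrays must have the same length")
    return [x + y for x, y in zip(a, b)]

A = [1, 2, 3]
B = [4, 5, 6]
print(inner_product(A, B))     # (1*4) + (2*5) + (3*6)
print(elementwise_add(A, B))
```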

Implementation of Arrays

• An access function maps subscript expressions to an
address in the array
• Access function for single-dimensioned arrays:
address(array1[k]) = address(array1[lower_bound])
+ ((k - lower_bound) * element_size)
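
The access function translates directly into code. A minimal sketch in Python follows; the function name, the base address 1000, and the 4-byte element size are made-up values chosen only to make the arithmetic easy to follow.

```python
def element_address(base_address, lower_bound, element_size, k):
    """Address of array1[k], given the address of array1[lower_bound]."""
    return base_address + (k - lower_bound) * element_size

# Hypothetical array: starts at address 1000, subscripts begin at 1,
# and each element occupies 4 bytes.
for k in (1, 2, 3):
    print(k, element_address(1000, 1, 4, k))
```

With a lower bound of 0, as in C, the subtraction disappears and the address is simply base + k * element_size.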

Accessing Multi-dimensioned Arrays
• Two common ways:
– Row major order (by rows) – used in most languages
– Column major order (by columns) – used in Fortran

Locating an Element in a Multi-dimensioned
Array (row major)
• General format
Location(a[i,j]) = address of a[row_lb,col_lb]
+ (((i - row_lb) * n) + (j - col_lb)) * element_size

where n is the number of elements per row.
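
The row-major formula can be checked against a matrix that has been flattened row by row. This is a sketch in Python; row_major_location is a hypothetical name, and the 3-by-4 matrix is sample data for the check.

```python
def row_major_location(base, row_lb, col_lb, n, element_size, i, j):
    """Location of a[i,j] when rows are stored one after another.

    n is the number of elements per row.
    """
    return base + ((i - row_lb) * n + (j - col_lb)) * element_size

# A 3 x 4 matrix and the same values laid out in row-major order.
rows = [[0, 1, 2, 3],
        [4, 5, 6, 7],
        [8, 9, 10, 11]]
flat = [value for row in rows for value in row]

# With base 0, lower bounds 0, and element size 1, the location is a list index.
index = row_major_location(0, 0, 0, 4, 1, 2, 1)
print(index, flat[index], rows[2][1])
```

For column-major order the roles of i and j swap and n becomes the number of elements per column.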

Record Types

• A record is a possibly heterogeneous aggregate of
data elements in which the individual elements are
identified by names
• Design issues:
– What is the syntactic form of references to a field?
– Are elliptical references allowed?

Examples
• COBOL uses level numbers to show nested records:
01 EMP-REC.
   05 FIRST PIC X(20).
   05 MID PIC X(10).
   05 LAST PIC X(20).
   05 HOURLY-RATE PIC 99V99.

• The EMP-REC record consists of four fields.
• The level numbers, such as 01 and 05, indicate the position of a data item
in a hierarchical structure of the data.
• The 01 level is called a record. The numbers 02-49 are available for
subdivision of a record. Gaps are usually left between the level numbers to
allow for ease in modifying the record structure.
• PIC X(n), called the picture clause, specifies n alphanumeric characters for
the field.
• 99V99 specifies four decimal digits with an implied decimal point in the middle.

Examples
• Record structures are indicated in Ada as follows:
type Emp_Rec_Type is record
First: String (1..20);
Mid: String (1..10);
Last: String (1..20);
Hourly_Rate: Float;
end record;
Emp_Rec: Emp_Rec_Type;

References to Fields
• Record Field References
1. COBOL
field_name OF record_name_1 OF ... OF record_name_n
For example, the MID field in the above COBOL example
can be referenced with
MID OF EMP-REC

2. Most other languages use dot notation.
record_name_1. ... record_name_n.field_name
For example, the Mid field in the above Ada example can be
referenced with
Emp_Rec.Mid

References to Records

• Fully qualified references must include all record
names.
• Elliptical references allow leaving out record names
as long as the reference is unambiguous, for example
in COBOL:
FIRST instead of FIRST OF EMP-REC is
an elliptical reference to the employee’s first name.
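
One way to picture how an elliptical reference is resolved is to search the record structure for the field name and accept it only if exactly one match exists. The sketch below is in Python and is not how a COBOL compiler is actually written; MGR-REC and all field values are invented so that one name becomes ambiguous.

```python
# Hypothetical record structure: EMP-REC follows the COBOL example,
# MGR-REC is invented here to make the name FIRST ambiguous.
RECORDS = {
    "EMP-REC": {"FIRST": "emp-first", "MID": "emp-mid", "LAST": "emp-last"},
    "MGR-REC": {"FIRST": "mgr-first"},
}

def find_paths(record, field, prefix=()):
    """Return every path of names that ends in the requested field."""
    paths = []
    for name, value in record.items():
        path = prefix + (name,)
        if name == field:
            paths.append(path)
        if isinstance(value, dict):
            paths.extend(find_paths(value, field, path))
    return paths

def resolve(record, field):
    """Resolve an elliptical reference; it must match exactly one field."""
    paths = find_paths(record, field)
    if len(paths) != 1:
        raise LookupError(f"{field!r} matches {len(paths)} fields")
    value = record
    for name in paths[0]:
        value = value[name]
    return value

print(resolve(RECORDS, "MID"))   # unambiguous, so the record name can be omitted
```

Here resolve(RECORDS, "FIRST") raises LookupError because two records contain a FIRST field, so a fully qualified reference would be required.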

Operations on Records
• Assignment is very common.
• Ada allows record comparison for equality and
inequality.
• Ada records can be initialized.

Evaluation and Comparison to Arrays
• Records are used when a collection of data values is
heterogeneous.
• Arrays are used when all the data values have the
same type.
• Access to array elements can be slower than access
to record fields, because subscripts are usually evaluated
at run time while field offsets are fixed at compile time.

Pointer and Reference Types
• A pointer type variable has a range of values that
consists of memory addresses and a special value,
nil.
• The value nil indicates that a pointer cannot currently
be used to reference any memory cell.
• Pointers, unlike arrays and records, are not structured
types.
• Very useful for implementing dynamic data structures,
such as linked lists and trees.

Pointer Operations
• Two fundamental operations: assignment and
dereferencing.
• Assignment is used to set a pointer variable’s value
to some useful address.
• Dereferencing yields the value stored at the location
represented by the pointer’s value
– Dereferencing can be explicit or implicit
– Fortran 95 uses implicit dereferencing.
– C++ uses an explicit operation via *
j = *ptr sets j to the value located at ptr

Pointer Assignment Illustrated

The assignment operation j = *ptr sets j to 206

Problems with Pointers
• Dangling pointers (dangerous)
– A pointer points to a heap-dynamic variable that has been
deallocated.
– It could be created by the following sequence of operations:
• Pointer p1 is set to point to a new heap-dynamic
variable.
• Pointer p2 is assigned p1’s value.
• The heap-dynamic variable pointed to by p1 is explicitly
deallocated (possibly setting p1 to nil), but p2 is not changed by
the operation. p2 is now a dangling pointer.
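
The sequence above can be replayed on a toy heap. This is a simulation in Python, assuming a dictionary from addresses to values stands in for heap storage; ToyHeap and its methods are hypothetical. Unlike this toy, a real C or C++ program would usually not detect the bad access.

```python
class ToyHeap:
    """A dictionary from made-up addresses to values, standing in for heap storage."""

    def __init__(self):
        self.cells = {}
        self.next_address = 100

    def allocate(self, value):
        address = self.next_address
        self.next_address += 1
        self.cells[address] = value
        return address

    def deallocate(self, address):
        del self.cells[address]

    def dereference(self, address):
        if address not in self.cells:
            raise RuntimeError(f"dangling pointer: address {address} was deallocated")
        return self.cells[address]

heap = ToyHeap()
p1 = heap.allocate(206)   # p1 points to a new heap-dynamic variable
p2 = p1                   # p2 is assigned p1's value
heap.deallocate(p1)       # the variable is explicitly deallocated
p1 = None                 # p1 is set to nil, but p2 still holds the old address
print(p2)
```

Calling heap.dereference(p2) at this point raises RuntimeError, because p2 refers to storage that no longer exists.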

Problems with Pointers
• Lost heap-dynamic variable
– An allocated heap-dynamic variable that is no longer
accessible to the user program (often called garbage)
– It could be created by the following sequence of
operations:
• Pointer p1 is set to point to a newly created
heap-dynamic variable.
• Pointer p1 is later set to point to another newly created
heap-dynamic variable, without the first being deallocated.
The first variable is now inaccessible.
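
Garbage can be shown with a similar toy model: a cell that is still allocated but that no pointer refers to. This is a sketch in Python; ToyHeap and unreachable_cells are hypothetical names, and the addresses are made up.

```python
class ToyHeap:
    """A dictionary from made-up addresses to values, standing in for heap storage."""

    def __init__(self):
        self.cells = {}
        self.next_address = 100

    def allocate(self, value):
        address = self.next_address
        self.next_address += 1
        self.cells[address] = value
        return address

def unreachable_cells(heap, pointers):
    """Addresses that are still allocated but that no pointer holds."""
    return set(heap.cells) - set(pointers)

heap = ToyHeap()
p1 = heap.allocate("first")    # p1 points to a newly created variable
p1 = heap.allocate("second")   # p1 is redirected; the first is never deallocated

garbage = unreachable_cells(heap, [p1])
print(garbage)
```

The first cell still occupies storage, yet the program has no way left to reach or free it.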

Pointer Arithmetic in C and C++
float stuff[100];
float *p;
p = stuff;

*(p+5) is equivalent to stuff[5] and p[5]
*(p+i) is equivalent to stuff[i] and p[i]
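
The equivalence can be observed from Python through the standard ctypes module, which exposes C arrays and pointers. This is only a sketch: the names stuff and p follow the C fragment above, and the value 2.5 is arbitrary sample data.

```python
import ctypes

# float stuff[100];
stuff = (ctypes.c_float * 100)()

# float *p; p = stuff;
p = ctypes.cast(stuff, ctypes.POINTER(ctypes.c_float))

stuff[5] = 2.5

# p[5]: indexing through the pointer.
via_index = p[5]

# *(p+5): the base address plus 5 elements, each sizeof(float) bytes wide.
address = ctypes.addressof(stuff) + 5 * ctypes.sizeof(ctypes.c_float)
via_address = ctypes.c_float.from_address(address).value

print(stuff[5], via_index, via_address)
```

The explicit multiplication by sizeof(float) is what the C compiler does implicitly when it evaluates p+5.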

Summary
• The data types of a language are a large part of what
determines that language’s style and usefulness.
• The primitive data types of most imperative languages
include numeric, character, and boolean types.
• The user-defined enumeration and subrange types are
convenient and add to the readability and reliability of
programs.
• Arrays and records are included in most languages.
• Pointers are used for addressing flexibility and to control
dynamic storage management.
