Unit II - Structuring Data, Computations and Programming

Unit II discusses the structuring of data and computations in programming, covering topics such as built-in and primitive data types, type systems, and user-defined types. It explains how programming languages classify data, the importance of data aggregation, and the concept of abstract data types, particularly in C++. The document emphasizes the role of types in organizing data, protecting against improper manipulations, and facilitating the creation of complex data structures.

Uploaded by

rajatchaudhari54
Copyright
© All Rights Reserved
UNIT II - Structuring Data, Computations and Programming
Structuring of Data: Built-in and primitive types, Data aggregates and type
constructors, Cartesian product, Finite mapping, User-defined types and abstract
data types, Type systems, Static versus dynamic program checking, Strong typing
and type checking, Type compatibility, Type conversions, Types and subtypes,
Generic types, Monomorphic versus polymorphic type systems.
Structuring of Computations: Structuring the computation, Expressions and
statements, Conditional execution and iteration, Routines, Style.
 Computer programs can be viewed as functions that are applied to values of
certain input domains to produce results in some other domains.
 Data structuring is central to programming and underlies every program.
 In conventional languages, these functions are evaluated through a sequence of
steps that produce intermediate data, which are stored in program variables.
 Languages do so by providing features to describe data, the flow of
computation, and the overall program organization.
 Programming languages organize data through the concept of type: types classify
data into different categories.
 More precisely, a type is defined as a set of values together with a set of
operations that can be used to manipulate them.
 For example, the type BOOLEAN of languages like Ada and Pascal consists of
the values TRUE and FALSE;
 Boolean algebra defines operators NOT, AND, and OR for BOOLEANs.
 BOOLEAN values may be created, for example, as a result of the application of
relational operators (<, ≤, >, ≥, =, ≠) among INTEGER expressions.
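The idea that a type is a set of values plus a set of operations can be sketched in C++, using bool in place of the BOOLEAN type of Ada and Pascal. The helper name in_range is hypothetical, and the checks are evaluated at compile time.

```cpp
// Hypothetical helper: relational operators applied to int expressions
// yield bool values, which are then combined with the Boolean operator AND (&&).
constexpr bool in_range(int x, int low, int high) {
    return (x >= low) && (x <= high);
}

// The two values of the type are true and false; NOT, AND and OR operate on them.
static_assert(in_range(5, 1, 10), "5 lies in 1..10");
static_assert(!in_range(11, 1, 10), "NOT: 11 does not lie in 1..10");
static_assert(in_range(0, 1, 10) || in_range(0, -5, 5), "OR: the second test holds");
```

If any of the three conditions were false, the translation unit would fail to compile rather than fail at run time.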
Elementary Data Types
 An elementary data object contains a single data value. An elementary data
type is a class of such data objects together with the operations defined over them.
 Some elementary data types: integer, real, character, Boolean,
enumeration and pointer.
1. Attributes – Basic attributes of a data object, such as its data type and
name, are usually invariant during its lifetime.
2. Values – The type of a data object determines the set of possible values
that it may contain.
3. Operations – The set of operations defined by the language determines how
data objects of that type may be manipulated.
Built-in types and primitive types
 Any programming language is equipped with a finite set of built-in
(or predefined) types, which normally reflect the behavior of the
underlying hardware.
 At the hardware level, values belong to the untyped domain of bit strings,
which constitutes the underlying universal domain of computer data.
 At the hardware level, a type may thus be considered as a view under
which data belonging to the universal type may be manipulated.
 As an example, the bit string "01001010" might be interpreted as integer
"74" (coded in two's complement representation) when it is the argument
of the machine instruction ADD (which does integer addition).
 However, it would be interpreted as a bit string by the machine instruction
CPL (which does bitwise complement).
 It might be interpreted as ASCII character "J" if printed by instruction PCH
(which prints an ASCII character).
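The three views of the bit string "01001010" can be reproduced in C++ as a rough sketch. The variable names are hypothetical, the casts stand in for the machine instructions ADD, CPL and PCH, and the character check assumes an ASCII-compatible character set.

```cpp
#include <cstdint>

// One 8-bit pattern, viewed under three different "types".
constexpr std::uint8_t bits = 0b01001010;

// View 1: an integer operand, as an integer addition would see it.
constexpr int as_integer = bits;

// View 2: a raw bit string, as a bitwise complement would see it.
constexpr std::uint8_t complemented = static_cast<std::uint8_t>(~bits);

// View 3: a character code, as a print-character instruction would see it.
constexpr char as_character = static_cast<char>(bits);

static_assert(as_integer == 74, "integer view");
static_assert(complemented == 0b10110101, "bit-string view after complement");
static_assert(as_character == 'J', "character view, assuming ASCII");
```

The stored bits never change; only the operation applied to them decides which interpretation is used.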
 The built-in types of a programming language reflect the different views
provided by typical hardware. Examples of built-in types are:
• booleans, i.e., truth values TRUE and FALSE, along with the set of operations
defined by Boolean algebra;
• characters, e.g., the set of ASCII characters;
• integers, e.g., the set of 16-bit values in the range <-32768, 32767>; and
• reals, e.g., floating point numbers with given size and precision.
 Primitive data types may refer to the standard data types built into a
programming language (built-in types).
 A primitive type is the most fundamental kind of data type available in a
programming language.
 Java, for example, has eight primitive data types: boolean, byte, char, short, int,
long, float, and double.
 In a programming language, these data types serve as the foundation for data
manipulation.
 Most programming languages build in all of the basic data types.
 Primitive data types may or may not have a one-to-one correspondence with
objects in the computer's memory, depending on the language and its
implementation.
 For example, integer addition can be performed as a single machine
instruction, and some processors provide specific instructions for processing
character sequences with a single instruction.
 The C standard specifically states that "a 'plain' int object has the natural size
suggested by the execution environment's architecture".
 On a 32-bit architecture, this means that int will most likely be 32 bits long.
 Basic primitive types are almost always value types.
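The remark about the size of int can be checked on any given platform with a small C++ sketch. The names int_bits and largest_int are hypothetical; the only portable guarantee used here is that int is at least 16 bits wide.

```cpp
#include <climits>
#include <limits>

// Width of a plain int, in bits, on the platform this is compiled for.
constexpr int int_bits = static_cast<int>(sizeof(int) * CHAR_BIT);

// Largest value an int can hold here (2147483647 where int is 32 bits wide).
constexpr int largest_int = std::numeric_limits<int>::max();

// The language standard only promises at least 16 bits; anything wider
// is a choice made by the implementation for its target architecture.
static_assert(int_bits >= 16, "int must be at least 16 bits wide");
static_assert(largest_int >= 32767, "int must cover at least the 16-bit range");
```

Printing int_bits on a typical 32-bit or 64-bit desktop compiler is likely to show 32, but that is an observation about common implementations, not a rule.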
 Most programming languages do not allow programs to change the
behaviour or capabilities of primitive (built-in or basic) data types.
 Such data types serve a single purpose: they contain pure, simple values of a
type.
 Because these data types are defined by default in the programming
language's type system, they come with a set of predefined operations.
 Built-in types can be viewed as a mechanism for classifying the data
manipulated by a program.
 Moreover, they are a way of protecting the data against forbidden,
nonsensical, or unintended manipulations.
 The following are advantages of built-in types:
1. Hiding of the underlying representation.
2. Correct use of variables can be checked at translation time.
3. Resolution of overloaded operators can be done at translation time.
4. Accuracy control.
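Advantages 2 and 3 in the list above can be illustrated with a short C++ sketch. The enumeration kind and the overloaded function kind_of are hypothetical names introduced only for this example.

```cpp
// Hypothetical tag describing which overload was selected.
enum class kind { integer, real };

// Two overloads with the same name; the compiler picks one from the
// static types of the arguments, so no run-time type test is needed.
constexpr kind kind_of(int)    { return kind::integer; }
constexpr kind kind_of(double) { return kind::real; }

// Overload resolution happens at translation time.
static_assert(kind_of(3) == kind::integer, "int argument selects the int overload");
static_assert(kind_of(3.0) == kind::real, "double argument selects the double overload");

// The built-in + is itself overloaded: int + double is a floating-point addition.
static_assert(kind_of(1 + 2.5) == kind::real, "mixed addition yields a double");

// Incorrect use of a variable is also rejected at translation time:
// int n = "three";   // error: a string literal cannot initialize an int
```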
 Some types can be called primitive (or elementary). That is, they are not built
from other types.
 Their values are atomic, and cannot be decomposed into simpler constituents.
 In most cases, built-in types coincide with primitive types, but there are
exceptions.
 For example, in Ada both Character and String are predefined. Data of type
String have constituents of type Character, however. In fact, String is
predefined as:
type String is array (Positive range <>) of Character;
 It is also possible to declare new types that are elementary. An example is given
by enumeration types in Pascal, C, or Ada. For example, in Pascal one may write:
type color = (white, yellow, red, green, blue, black);
 The same would be written in Ada as
type color is (white, yellow, red, green, blue, black);
 Similarly, in C one would write:
enum color {white, yellow, red, green, blue, black};
 In computer science, a primitive data type is either of the following:
1. A basic type is a data type provided by a programming language as a basic
building block. Most languages allow more complicated composite types to
be recursively constructed starting from basic types.
2. A built-in type is a data type for which the programming language provides
built-in support.
Common data types include:
 Integer
 String
 Character
 Floating-point number
 Boolean
Data aggregates and type constructors
 Data aggregation is a process of compiling typically large amounts of
information from a given database and organizing it into a more
consumable and comprehensive medium.
 Data aggregation is the process of collecting data to present it in
summary form.
 This information is then used to conduct statistical analysis and can also
help company executives make more informed decisions about
marketing strategies, price settings, and structuring operations, among
other things.
 Data aggregation is typically performed on a large scale via software
programs known as data aggregators.
 Data scientists and analysts are the most common users of data
aggregation tools and data aggregation software.
 Programming languages allow the programmer to specify aggregations of
elementary data objects and, recursively, aggregations of aggregates.
 They do so by providing a number of constructors.
 In this context, a constructor is a type constructor: a language mechanism
that builds a compound type or object out of simpler ones. (A C++ class
constructor is a different notion: an initialization function that is
automatically called whenever an object of a class is created.)
 The resulting objects are called compound objects.
 A well-known example is the array constructor, which constructs aggregates
of homogeneous-type elements.
 In modern languages, constructors can be used to define both
aggregate objects and new aggregate types.
 Routines can also be seen as constructors which allow elementary
instructions to be combined to form new operations.
1. Cartesian product
 The Cartesian product of n sets A1, A2, . . ., An, denoted A1 x A2 x . . . x An, is
a set whose elements are ordered n-tuples (a1, a2, . . ., an), where each ak
belongs to Ak.
 For example, regular polygons might be described by an integer (the number of
edges) and a real (the length of each edge).
 A polygon would thus be an element in the Cartesian product integer x real.
 Programming languages view elements of a Cartesian product as composed of
a number of symbolically named fields.
 In the example, a polygon could be declared as composed of an integer field
(no_of_edges) holding the number of edges and a real field (edge_size) holding
the length of each edge.
 Examples of Cartesian product constructors in programming languages are
structures in C, C++, Algol 68 and PL/I, records in COBOL, Pascal, and Ada.
 COBOL was the first language to introduce Cartesian products, which proved to
be very useful in data processing applications.
 For example, in a payroll transaction, employees are described by an n-tuple of
attributes (such as name, address, social security number, and salary),
some of which may in turn be described by an n-tuple of attributes (e.g., an
address is composed of street name, number, city, state, and zip code). Such an
aggregation can be described by a record.
 As an example of a Cartesian product constructor, consider the following C
declaration, which defines a new type reg_polygon and two objects pol_a and pol_b:
struct reg_polygon {
    int no_of_edges;
    float edge_size;
};
struct reg_polygon pol_a = {3, 3.45}, pol_b = {3, 3.45};
 The two regular polygons pol_a and pol_b are initialized as two equilateral
triangles whose edge is 3.45.
 The notation {3, 3.45} is used to implicitly define a constant value (also called a
compound value) of type reg_polygon (the polygon with 3 edges of length 3.45).
 The fields of an element of a Cartesian product are selected by name. For
example, one may write
pol_a.no_of_edges = 4;
to make pol_a a quadrilateral. This syntactic notation for selection, which is common
in programming languages, is called the dot notation.
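The Cartesian product constructor and the dot notation can be put together in a short C++ sketch. The function perimeter and the payroll types address and employee are hypothetical, chosen to mirror the polygon and payroll examples in the text.

```cpp
#include <string>

// Cartesian product integer x real: one named field per component.
struct reg_polygon {
    int no_of_edges;
    float edge_size;
};

// Total edge length, computed by selecting both fields with the dot notation.
constexpr float perimeter(const reg_polygon& p) {
    return p.no_of_edges * p.edge_size;
}

// Hypothetical payroll records: a field may itself be a Cartesian product.
struct address {
    std::string street_name;
    int number;
    std::string city;
};

struct employee {
    std::string name;
    address home;      // nested record, selected as e.home.city
    double salary;
};

static_assert(perimeter(reg_polygon{4, 2.5f}) == 10.0f, "square with edge 2.5");
```

Nested selection simply chains the dot notation: for an employee e, the city is reached as e.home.city.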
2. Finite mapping
 A finite mapping is a function from a finite set of values of a domain type DT
onto values of a range type RT.
 Such a function may be defined in programming languages through the use of
the mechanisms provided to define routines.
 This would encapsulate in the routine definition the law associating values
of type RT to values of type DT. This definition is called intensional.
 In addition, programming languages provide the array constructor to define
finite mappings as data aggregates. This definition is called extensional.
 For example, the C declaration
char digits [10];
 defines a mapping from integers in the subrange 0 to 9 to the set of
characters, although it does not state which character corresponds to each
element of the subrange.
 The following statements
for (i = 0; i < 10; ++i)
    digits [i] = ' ';
 define one such correspondence, by initializing the array to all blank characters.
 This example also shows that an object in the range of the function is
selected by indexing, that is, by providing the appropriate value in the
domain as an index of the array.
 Thus the C notation digits [i] can be viewed as the application of the mapping
to the argument i.
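The difference between an intensional and an extensional definition of the same finite mapping can be sketched in C++. This example maps 0..9 to the digit characters rather than to blanks, and the routine name digit_char is hypothetical.

```cpp
#include <array>

// Intensional definition: a routine encodes the law that associates
// each value of the domain 0..9 with a character.
constexpr char digit_char(int i) {
    return static_cast<char>('0' + i);
}

// Extensional definition: an array stores the associated value
// for every element of the domain explicitly.
constexpr std::array<char, 10> digits = {'0', '1', '2', '3', '4',
                                         '5', '6', '7', '8', '9'};

// Indexing the array plays the role of applying the mapping to an argument.
static_assert(digits[7] == digit_char(7), "both definitions agree on 7");
static_assert(digits.size() == 10, "the domain has exactly ten elements");
```

The routine costs computation on every application, while the array costs storage for every element of the domain; which trade-off is better depends on the mapping.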
 For example, in Pascal, it is possible to declare
var x: array [2..5] of integer;
 which defines x to be an array whose domain type is the subrange 2..5.
 As another example in Pascal, having defined a type
computer_manufacturer by enumeration
type computer_manufacturer = (ibm, dec, hp, sun, apple,
 one may use the array type constructor to define the following new type to
represent data about each computer manufacturer
type c_m_data = array [computer_manufacturer] of integer;
 and then the following data objects
var c_m_profits, c_m_employees: c_m_data;
 For example, c_m_employees[hp] would give the number of employees of computer
manufacturer hp.
 If only the data regarding profits are needed, one could simply declare a single array
data aggregate instead of defining a new type of which many instances can be
generated:
var c_m_profits: array [computer_manufacturer] of integer;
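A rough C++ counterpart of the Pascal array indexed by an enumeration is sketched below. It uses only the five manufacturers named in the text; the sentinel manufacturer_count, the function lookup and the employee figures are all hypothetical.

```cpp
#include <array>
#include <cstddef>

// Only the five manufacturers named in the text. manufacturer_count is a
// hypothetical sentinel whose value equals the size of the domain.
enum computer_manufacturer { ibm, dec, hp, sun, apple, manufacturer_count };

// Counterpart of the Pascal type c_m_data: one int per manufacturer.
using c_m_data = std::array<int, manufacturer_count>;

// Applies the finite mapping: selects the entry for manufacturer m.
constexpr int lookup(const c_m_data& data, computer_manufacturer m) {
    return data[static_cast<std::size_t>(m)];
}

// Made-up figures, for illustration only.
constexpr c_m_data c_m_employees = {10, 20, 30, 40, 50};

static_assert(lookup(c_m_employees, hp) == 30, "hp is the third element of the domain");
```

Unlike Pascal, C++ does not check that an index lies within the enumeration's range, so the cast in lookup relies on the caller passing a valid manufacturer.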
 Languages that allow variables to be initialized when they are declared may
also provide a way to initialize array objects.
 For example, in C arrays may be initialized through a compound value, as shown by
the following example
char digits [5] = {' ', ' ', ' ', ' ', ' '};
 where {' ', ' ', ' ', ' ', ' '} is a compound value of type "array of 5 characters."
 Similarly, in Ada one might write
X: array (INTEGER range 2..6) of INTEGER := (0, 2, 0, 5, -33);
 to define an array whose index is in the subrange 2..6, where X(2) = 0,
X(3) = 2, X(4) = 0, X(5) = 5, X(6) = -33.
 Ada uses parentheses "(" and ")" instead of brackets "[" and "]" to index arrays. This
makes indexing an array syntactically identical to calling a function.
User-defined types and abstract data types
 Modern languages also allow aggregates built through composition of
built-in types to be named as new types.
 Having given a type name to an aggregate data structure, one can declare
as many variables of that type as necessary by simple declarations.
 For example, after the C declaration which introduces a new type name
complex
struct complex {
    float real_part, imaginary_part;
};
 any number of instance variables may be defined to hold complex values:
struct complex a, b, c, . . .;
 The ability to define a type name for a user-defined data structure is only a
first step in the direction of supporting data abstractions.
 The two main benefits of introducing types in a language are classification
and protection.
 Types allow the (otherwise unstructured) world of data to be organized as a
collection of different categories.
 Types also allow data to be protected from undesirable manipulations by
specifying exactly which operations are legal for objects of a given type and
by hiding the concrete representation.
 Of these two properties, only the former is achieved by defining a user-defined
data structure as a type.
 What is needed is a construct that allows both a data structure and
operations to be specified for user-defined types.
 In other words, we need a construct to define abstract data types.
 An abstract data type is a new type for which we can define the operations
to be used for manipulating instances, while the data structure that
implements the type is hidden from the users.
1 Abstract data types in C++
 Abstract data types can be defined in C++ through the class construct.
 A class encloses the definition of a new type and explicitly provides the
operations that can be invoked for correct use of instances of the type.
 As an example, Figure 1 shows a class defining the type of the geometrical
concept of point.
class point {
int x, y;
public:
point (int a, int b) { x = a; y = b; } // initializes the coordinates of a point
void x_move (int a) { x += a; } // moves the point horizontally
void y_move (int b ){ y += b; } // moves the point vertically
void reset ( ) { x = 0; y = 0; } // moves the point to the origin
};
 FIGURE 1. A C++ class defining point
 A class can be viewed as an extension of structures (or records), where
fields can be both data and routines.
 The difference is that only some fields (declared public) are accessible from
outside the class.
 Non-public fields are hidden from the users of the class.
 In the example, the class construct encapsulates both the definition of the
data structure defined to represent points (the two integer numbers x and
y) and of the operations provided to manipulate points.
 The data structure which defines a geometrical point (two integer
coordinates) is not directly accessible by users of the class.
 Rather, points can only be manipulated by the operations defined as public
routines, as shown by the following fragment:
point p1 (1, 3); // instantiates p1 and initializes its value
point p2 (55, 0); // instantiates p2 and initializes its value
point* p3 = new point (0, 0); // p3 points to the origin
p1.x_move (3); // moves p1 horizontally
p2.y_move (99); // moves p2 vertically
p1.reset ( ); // positions p1 at the origin
 The fragment shows how operations are invoked on points by means of the
dot notation; that is, by writing "object_name.public_routine_name".
 The only exceptions are the invocations of constructors and destructors.
 A constructor is an operation that has the same name as the new type
being defined (in the example, point).
 A constructor is automatically invoked when an object of the class is
allocated.
 In the case of points p1 and p2, this is done automatically when the scope
in which they are declared is entered.
 In the case of the dynamically allocated point referenced by p3, this is
done when the new instruction is executed.
 A special type of constructor is a copy constructor.
 The constructor we have seen for point builds a point out of two int
values.
 A copy constructor is able to build a point out of an existing point.
 The signature of the copy constructor would be:
point (point&)
 The copy constructor allows us to build a new object from an existing object
without knowing the components that constitute the object.
 When a parameter is passed by value to a procedure, copy construction is
used to build the formal parameter from the argument.
 Copy construction is similar to assignment, with the difference that
on assignment both objects already exist, whereas on copy construction a new
object must be created first and then a value assigned to it.
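The role of the copy constructor in parameter passing by value can be sketched as follows. This is a cut-down variant of the point class, not the class of Figure 1: the counter copies_made, the accessor get_x and the functions moved_x and check_copy_on_call are hypothetical additions, and the copy constructor takes const point& (the conventional C++ form) rather than point&.

```cpp
#include <cassert>

class point {
    int x, y;
public:
    static int copies_made;                  // hypothetical counter, for illustration
    point(int a, int b) : x(a), y(b) {}
    point(const point& other)                // copy constructor
        : x(other.x), y(other.y) { ++copies_made; }
    void x_move(int a) { x += a; }
    int get_x() const { return x; }          // hypothetical accessor
};
int point::copies_made = 0;

// p is passed by value, so it is copy-constructed from the argument;
// moving p therefore leaves the caller's object untouched.
int moved_x(point p) {
    p.x_move(10);
    return p.get_x();
}

// Self-check: exactly one copy is made per by-value call.
void check_copy_on_call() {
    point p1(1, 3);
    int before = point::copies_made;
    assert(moved_x(p1) == 11);                  // the copy was moved
    assert(p1.get_x() == 1);                    // the original was not
    assert(point::copies_made == before + 1);   // one copy construction
}
```

Calling check_copy_on_call() from a main function exercises the sketch; an assignment such as p2 = p1 between two existing points would not increase the counter, because no new object is created.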
Type systems
 Types are a fundamental semantic concept of programming languages.
 The type system adopted by a language is defined as the set of rules used by the
language to structure and organize its collection of types.
 A type is defined as a set of values and a set of operations that can be
applied to such values.
 As usual, since values in our context are stored somewhere in the memory
of a computer, we use the term object (or data object) to denote both the
storage and the stored value.
 The operations defined for a type are the only way of manipulating its
instance objects: they protect data objects from any illegal uses.
 Any attempt to manipulate objects with illegal operations is a type error.
 A program is said to be type safe (or type secure) if all operations in the
program are guaranteed to always apply to data of the correct type, i.e., no
type errors will ever occur.
1 Static versus dynamic program checking
 Errors can be classified in two categories: language errors and
application errors.
 Language errors are syntactic and semantic errors in the use of the
programming language.
 Application errors are deviations of the program behavior with respect to
specifications (assuming specifications capture the required behavior
correctly).
 The programming language should make it easy to identify and remove both
kinds of errors.
 Error checking can be accomplished in different ways, which can be classified
into two broad categories: static and dynamic.
 Dynamic checking requires the program to be executed on sample input
data. Static checking does not.
 In general, if a check can be performed statically, it is preferable to do so
instead of delaying the check to run-time for two main reasons.
 First, potential errors are detected at run time only if one can provide input
data that cause the error to be revealed.
 For example, a type error might exist in a portion of the program that is not
executed by the given input data.
 Second, dynamic checking slows down program execution.
 Static checking is often called compile-time (or translation-time) checking.
 However, programs may be subject to separate compilation, and some static checks
might occur at link time.
 For example, the possible mismatch between a routine called by one
module and defined in another might be checked at link time.
 Static checking, though preferable to dynamic checking, does not uncover
all language errors.
 Some errors only manifest themselves at run time.
 For example, if div is the operator for integer division, the compiler might
check that both operands are integer. Whether the divisor is zero, however,
can in general only be checked at run time.
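The split between static and dynamic checking for integer division can be sketched in C++. The function names checked_div, division_fails and check_division are hypothetical, and reporting the error with an exception is just one possible way to perform a run-time check.

```cpp
#include <cassert>
#include <stdexcept>

// Static check: the compiler verifies that both operands are int.
// Dynamic check: whether the divisor is zero depends on run-time values.
int checked_div(int dividend, int divisor) {
    if (divisor == 0) {
        throw std::domain_error("division by zero");
    }
    return dividend / divisor;
}

// Returns true when the dynamic check fires for the given operands.
bool division_fails(int dividend, int divisor) {
    try {
        checked_div(dividend, divisor);
        return false;
    } catch (const std::domain_error&) {
        return true;
    }
}

// Self-check covering one passing and one failing input.
void check_division() {
    assert(checked_div(7, 2) == 3);
    assert(division_fails(7, 0));
}

// checked_div("7", 2);   // rejected at compile time: a string literal is not an int
```

The failing case is only discovered if a test actually supplies a zero divisor, which is the first of the two drawbacks of dynamic checking named in the text.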
2 Strong typing and type checking
 The type system of a language was defined as the set of rules to be
followed to define and manipulate program data.
 Such rules constrain the set of legal programs that can be written in a
language.
 The goal of a type system is to prevent the writing of type unsafe programs
as much as possible.
 A type system is said to be strong if it guarantees type safety; i.e.,
programs written by following the restrictions of the type system are
guaranteed not to generate type errors.
 A language with a strong type system is said to be a strongly typed
language.
 If a language is strongly typed, the absence of type errors from programs
can be guaranteed by the compiler.
 A type system is said to be weak if it is not strong. Similarly, a weakly typed
language is a language that is not strongly typed.
 Some languages are statically typed. Such languages are said to obey a static type
system.
 More precisely, such a type system requires that the type of every expression be known
at compile time.
 An example of a static type system can be achieved by requiring that
1. only built-in types can be used;
2. all variables are declared with an associated type;
3. all operations are specified by stating the types of the required
operands and the type of the result.
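The three requirements above can be illustrated with a C++ fragment restricted to built-in types. The names item_count, ratio and scale are hypothetical, and decltype is used only to make the statically known type of each expression visible.

```cpp
#include <type_traits>

// Requirement 2: every variable is declared with an associated type.
constexpr int item_count = 3;
constexpr double ratio = 0.5;

// Requirement 3: the operation states the types of its operands and of its result.
constexpr double scale(int n, double factor) { return n * factor; }

// Consequence: the type of every expression is known without running the program.
static_assert(std::is_same<decltype(item_count + 1), int>::value,
              "int + int has type int");
static_assert(std::is_same<decltype(item_count * ratio), double>::value,
              "int * double has type double");
static_assert(std::is_same<decltype(scale(item_count, ratio)), double>::value,
              "the result type of scale is double");
```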
 There are also languages where the binding between a variable and its type
cannot be established at compile time, and yet the rules of the type
system guarantee type safety.
 A type system must balance two conflicting requirements: the size of the set of
legal programs and the efficiency of the type checking procedure performed by the
compiler.
