Data Structure and Algorithm Original Note
LECTURE NOTE ON DATA STRUCTURE AND ALGORITHM
Mr Tajudeen A.A
Data Structure
The logical or mathematical model of a particular organization of data is called its data structure. A data item is a single unit of value. It is a raw fact which becomes information after processing. Data items, for example a date, are called group items if they can be divided into sub-items; the date, for instance, is represented by the day, the month and the year. An item such as a matriculation number is called an elementary item, because it cannot be sub-divided into sub-items. It is indeed treated as a single item. An entity is used to describe anything that has certain attributes or properties, which may be assigned values. For example, the following are possible attributes and their corresponding values for an entity known as STUDENT.

ATTRIBUTES:  NAME    AGE    SEX     MATRIC NO
VALUES:      Paul    21     Male    800654

Entities with similar attributes, for example all the 200-level Computer Science & Statistics students, form an entity set.
Main functions of data structures:
• Seek to identify and develop entities, operations and appropriate classes of problems that use them.
• Determine representations for abstract entities and implement abstract operations on those concrete representations.
Algorithm: A finite sequence of instructions, each of which has a clear meaning and can be executed with a finite amount of effort in finite time. Whatever the input values, an algorithm will definitely terminate after executing a finite number of instructions.
Characteristics of an algorithm:
• Input: zero or more quantities are supplied externally.
• Output: at least one quantity is produced.
• Definiteness: each instruction is clear and unambiguous.
• Finiteness: the algorithm terminates after a finite number of steps.
• Effectiveness: every instruction is basic enough to be carried out exactly and in finite time.
[Table: built-in data types (bit, Boolean, character, integer, short, long) with their storage sizes — a 1-byte word can represent 2^8 values and a 4-byte word 2^32 values, so choosing the smaller type saves storage space — and typical abstract operations on them, such as Create, Assign, AreEqual, IsTrue, MakeTrue, MakeFalse, Encode, Decode, Delete, MaxSize and Precedes/IsLessThan.]
Value range
The set of all possible values that could be assigned to a given attribute of an entity set is called the range of values of that attribute.
Data types
In mathematics it is customary to classify variables according to certain important characteristics. Clear
distinctions are made between real, complex, and logical variables or between variables representing
individual values, or sets of values, or sets of sets, or between functions, functionals, sets of functions, and so
on. This notion of classification is equally if not more important in data processing. We will adhere to the
principle that every constant, variable, expression, or function is of a certain type. This type essentially
characterizes the set of values to which a constant belongs, or which can be assumed by a variable or
expression, or which can be generated by a function .
Therefore, a data type is a set of values together with the operations defined on those values: {(values), (operations)}. The operations are performed on the values defined; e.g. for the integers, (..., -4, -1, 1, 3, 4, ...) are values while (+, -, *, /) are operations. Data types also allow us to associate meaning with a sequence of bits in the computer memory, e.g. the string "AA" or the integer 5.
Graph
Graphs are a commonly used data structure because they can be used to model many real-world problems. A graph consists of a set of nodes with an arbitrary number of connections, or edges, between the nodes. These edges can be either directed or undirected, and weighted or unweighted. In this study we will examine the basics of graphs and create a Graph class. This class is similar to the BinaryTree class, the difference being that instead of holding references for at most two edges, the Graph class's GraphNodes can hold an arbitrary number of references. This similarity is not surprising, because trees are a special case of graphs.
Definitions
Definition 1. A graph is a finite nonempty set V of objects called vertices (the singular is vertex) together with a (possibly empty) set E of unordered pairs of distinct vertices of V, called edges. Graphs are a very expressive formalism for system modeling, especially when attributes are allowed. Our research is mainly focused on the use of graphs for system verification. Up to now, there are two main approaches to modeling (typed) attributed graphs and specifying their transformation. Here we report preliminary results of our investigation of a third approach. In our approach we couple a graph to a data signature that consists of unary operations only. Therefore, we transform arbitrary signatures into a structure comparable to what is called a graph structure signature in the literature, and arbitrary algebras into the corresponding algebra graph. Some authors call a graph by the longer term ``undirected graph'' and simply use the following definition of a directed graph as a graph. However, when using Definition 1 of a graph, it is standard practice to abbreviate the phrase ``directed graph'' (as done below in Definition 2) with the word digraph.
Definition 2. A digraph is a finite nonempty set V of vertices together with a (possibly empty) set E of ordered pairs of vertices of V, called arcs. An arc that begins and ends at the same vertex u is called a loop. We usually (but not always) disallow loops in our digraphs. By being defined as a set, E does not contain duplicate (or multiple) edges/arcs between the same two vertices. For a given graph (or digraph) G we also denote the set of vertices by V(G) and the set of edges (or arcs) by E(G) to lessen any ambiguity.
Definition 3. The order of a graph (digraph), sometimes denoted by |V|, is its number of vertices; the size of this graph is its number of edges (arcs), |E|.
Sometimes we view a graph as a digraph where every unordered edge {u, v} is replaced by the two directed arcs (u, v) and (v, u). In this case, the size of a graph is half the size of the corresponding digraph.
In the next example we display a graph and a digraph, both of order 5. The size of the graph is 6, where its edge set is {(0, 1), (0, 2), (1, 2), (2, 3), (2, 4), (3, 4)}, while the size of the digraph is 7, where its arc set is {(0, 2), (1, 0), (1, 2), (1, 3), (3, 1), (3, 4), (4, 2)}.
Example 4. A pictorial example of a graph and a digraph is given below.
[Figure: drawings of the graph and the digraph of Example 4.]
For a graph, the in-degree and out-degree of a vertex are the same as its degree. For our graph of Example 4, we have deg(0)=2, deg(1)=2, deg(2)=4, deg(3)=2 and deg(4)=2. We may concisely write this as a degree sequence (2, 2, 4, 2, 2) if there is a natural ordering (e.g., 0, 1, 2, 3, 4) of the vertices. The in-degree sequence and out-degree sequence of the digraph are (1, 1, 3, 1, 1) and (1, 3, 0, 2, 1), respectively. The degree of a vertex of a digraph is sometimes defined as the sum of its in-degree and out-degree. Using this definition, a degree sequence of the digraph would be (2, 4, 3, 3, 2).
Definition 10. The diameter of a connected graph (strongly connected digraph) is the least integer D such that for all vertices u and v we have d(u, v) ≤ D, where d(u, v) denotes the distance from u to v, that is, the length of a shortest path between u and v.
Example 11. The diameter of the graph of Example 4 is 2. We calculated d(0,1)=1, d(0,2)=1, d(0,3)=2, d(0,4)=2, d(1,2)=1, d(1,3)=2, d(1,4)=2, d(2,3)=1, d(2,4)=1 and d(3,4)=1. Note that for graphs we have d(x, y) = d(y, x) for all vertices x and y.
Since the digraph is not strongly connected, its diameter is undefined. However, we can compute shortest distances between various pairs of vertices: d(0,2)=1, d(1,0)=1, d(1,2)=1, d(3,1)=1, d(3,0)=2, d(3,2)=2, d(3,4)=1 and d(4,2)=1.
Computer representations of graphs
There are two common computer representations for graphs (or digraphs), called adjacency matrices and adjacency lists. For a graph of order n, an adjacency matrix representation is a boolean n × n matrix (often encoded with 0's and 1's) such that entry (i, j) is true if and only if the edge/arc (i, j) is in E(G). For a graph of order n, an adjacency lists representation is n lists such that the i-th list contains a sequence (often sorted) of the out-neighbours of vertex i of G.
We can see the structure of these representations more clearly with examples.
Example 12. We give adjacency matrices for the graph and digraph of Example 4 below (rows and columns indexed 0 to 4):

Graph:              Digraph:
0 1 1 0 0           0 0 1 0 0
1 0 1 0 0           1 0 1 1 0
1 1 0 1 1           0 0 0 0 0
0 0 1 0 1           0 1 0 0 1
0 0 1 1 0           0 0 1 0 0

Notice that the 1's in a row represent how many out-neighbours a vertex has, and the 1's in a column represent how many in-neighbours it has.
Example 13. We give adjacency lists for the graph and digraph of Example 4 below.

Graph:              Digraph:
0: 1, 2             0: 2
1: 0, 2             1: 0, 2, 3
2: 0, 1, 3, 4       2:
3: 2, 4             3: 1, 4
4: 2, 3             4: 2

Only the out-neighbours are listed in the adjacency lists representation. The numbers with colons denote the index of the lists (and are not really necessary). An empty list can occur (e.g., list 2 of the digraph).
For a graph/digraph with n vertices and m edges, the adjacency matrix representation requires O(n^2) storage, while the adjacency lists representation requires O(n + m) storage. So for sparse graphs the latter is probably preferable. However, to check whether an edge/arc (i, j) is in the graph, the adjacency matrix representation has constant-time lookup, while the adjacency lists representation may require O(n) time in the worst case.
We mention that there are also other specialized graph representations besides the two mentioned
in this section. These data structures take advantage of the graph structure for improved storage
or access time, often for families of graphs sharing a common property.
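To make the two representations concrete, here is a minimal Java sketch (class and variable names are our own, not from the note) that builds the graph of Example 4 both as an adjacency matrix and as adjacency lists.

import java.util.ArrayList;
import java.util.List;

public class GraphDemo {
    public static void main(String[] args) {
        int n = 5;
        int[][] edges = {{0, 1}, {0, 2}, {1, 2}, {2, 3}, {2, 4}, {3, 4}};

        // Adjacency matrix: O(n^2) storage, constant-time edge lookup.
        boolean[][] matrix = new boolean[n][n];
        // Adjacency lists: O(n + m) storage, O(n) worst-case edge lookup.
        List<List<Integer>> lists = new ArrayList<>();
        for (int i = 0; i < n; i++) lists.add(new ArrayList<>());

        for (int[] e : edges) {              // undirected: store both directions
            matrix[e[0]][e[1]] = true;
            matrix[e[1]][e[0]] = true;
            lists.get(e[0]).add(e[1]);
            lists.get(e[1]).add(e[0]);
        }

        System.out.println("deg(2) = " + lists.get(2).size());   // prints 4
        System.out.println("edge (0,3)? " + matrix[0][3]);       // prints false
    }
}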
WEEK 3:
Symbols and relations
This week's learning outcomes:
• Define symbols and relations.
• Explain equivalence relations.
• Explain composite relations.
Relations
A binary relation is determined by specifying all ordered pairs of objects in that relation; it does
not matter by what property the set of these ordered pairs is described. We are led to the
following definition.
Definition. A set R is a binary relation if all elements of R are ordered pairs, i.e., if for any z ∈ R there exist x and y such that z = (x, y).
It is customary to write xRy instead of (x, y) ∈ R. We say that x is in relation R with y if xRy holds.
The set of all x which are in relation R with some y is called the domain of R and denoted by
"dom R." So dom R = {x | there exists y such that xRy}. dom R is the set of all first coordinates of
ordered pairs in R.
The set of all y such that, for some x, x is in relation R with y is called the range of R, denoted by
"ran R." So ran R = {y | there exists x such that xRy}.
Symbol
A symbol is something such as an object, picture, written word, sound, or particular mark that
represents something else by association, resemblance, or convention. For example, a red
octagon may stand for "STOP". On maps, crossed sabres may indicate a battlefield. Numerals are
symbols for numbers.
All language consists of symbols. The word "cat" is not a cat, but represents the idea of a cat.
Language and symbols
All languages are made up of symbols. Spoken words are the symbols of mental experience, and
written words are the symbols of spoken words.
The word "cat", for example, whether spoken or written, is not a literal cat but a sequence of
symbols that by convention associate the word with a concept. Hence, the written or spoken
word "cat" represents (or stands for) a particular concept formed in the mind. A drawing of a cat,
or a stuffed cat, could also serve as a symbol for the idea of a cat.
The study or interpretation of symbols is known as symbology, and the study of signs is known
as semiotics.
Symbols and corresponding LaTeX commands

Relational operators:
≡   \equiv
≈   \approx
∝   \propto
≃   \simeq
∼   \sim
≠   \neq
≥   \geq
≫   \gg
≪   \ll

Logic symbols:
¬   \neg
∧   \wedge
∨   \vee
⊕   \oplus
⇒   \Rightarrow
⇔   \Leftrightarrow
∃   \exists
∀   \forall

Set symbols:
∩   \cap
∪   \cup
⊃   \supset
⊂   \subset
∅   \emptyset
∈   \in
∉   \notin
ℤ   \mathbb{Z}   (requires the amsfonts and amssymb packages)
⋈   \Join        (requires the latexsym package, present in most LaTeX distributions)
Equivalence relation
An equivalence relation is a binary relation between elements of a set which groups them together as being "equivalent" in some way. Let a, b, and c be arbitrary elements of some set X. Then "a ~ b" or "a ≡ b" denotes that a is equivalent to b.
An equivalence relation "~" is reflexive, symmetric, and transitive. In other words, the following must hold for "~" to be an equivalence relation on X:
• Reflexivity: a ~ a
• Symmetry: if a ~ b then b ~ a
• Transitivity: if a ~ b and b ~ c then a ~ c.
An equivalence relation partitions a set into several disjoint subsets, called equivalence classes. All the elements in a given equivalence class are equivalent among themselves, and no element is equivalent to any element from a different class.
The equivalence class of a under "~", denoted [a], is the subset of X consisting of every element b for which a ~ b. X together with "~" is called a setoid.
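On a finite set the three conditions can be checked directly by brute force; the Java sketch below (method and variable names are illustrative, not from the note) tests reflexivity, symmetry and transitivity for a relation given as a set of pairs.

import java.util.Set;

public class EquivalenceCheck {
    // A relation on a finite set is given as a set of ordered pairs "a,b".
    static boolean contains(Set<String> r, int a, int b) { return r.contains(a + "," + b); }

    static boolean isEquivalence(Set<String> r, int[] x) {
        for (int a : x)
            if (!contains(r, a, a)) return false;                        // reflexivity
        for (int a : x) for (int b : x)
            if (contains(r, a, b) && !contains(r, b, a)) return false;   // symmetry
        for (int a : x) for (int b : x) for (int c : x)
            if (contains(r, a, b) && contains(r, b, c) && !contains(r, a, c))
                return false;                                            // transitivity
        return true;
    }

    public static void main(String[] args) {
        int[] x = {0, 1, 2};
        // "is congruent to modulo 2" restricted to {0, 1, 2}
        Set<String> mod2 = Set.of("0,0", "1,1", "2,2", "0,2", "2,0");
        System.out.println(isEquivalence(mod2, x));   // true
    }
}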
Examples of equivalence relations
A ubiquitous equivalence relation is the equality ("=") relation between elements of any set. Other examples include:
• "Has the same birthday as" on the set of all people, given naive set theory.
• "Is similar to" or "congruent to" on the set of all triangles.
• "Is congruent to modulo n" on the integers.
• "Has the same image under a function" on the elements of the domain of the function.
• Logical equivalence of logical sentences.
• "Is isomorphic to" on models of a set of sentences.
• In some axiomatic set theories other than the canonical ZFC (e.g., New Foundations and related theories):
  o Similarity on the universe of well-orderings gives rise to equivalence classes that are the ordinal numbers.
  o Equinumerosity on the universe of:
    - Finite sets gives rise to equivalence classes which are the natural numbers.
    - Infinite sets gives rise to equivalence classes which are the transfinite cardinal numbers.
• Let a, b, c, d be natural numbers, and let (a, b) and (c, d) be ordered pairs of such numbers. Then the equivalence classes under the relation (a, b) ~ (c, d) are the:
  o Integers if a + d = b + c;
  o Positive rational numbers if ad = bc.
• Let (rn) and (sn) be any two Cauchy sequences of rational numbers. The real numbers are the equivalence classes of the relation (rn) ~ (sn), if the sequence (rn - sn) has limit 0.
• Green's relations are five equivalence relations on the elements of a semigroup.
• "Is parallel to" on the set of subspaces of an affine space.
Examples of relations that are not equivalences
• The relation "≥" between real numbers is reflexive and transitive, but not symmetric. For example, 7 ≥ 5 does not imply that 5 ≥ 7. It is, however, a partial order.
• The relation "has a common factor greater than 1 with" between natural numbers greater than 1 is reflexive and symmetric, but not transitive. (The natural numbers 2 and 6 have a common factor greater than 1, and 6 and 3 have a common factor greater than 1, but 2 and 3 do not have a common factor greater than 1.)
• The empty relation R on a non-empty set X (i.e. aRb is never true) is vacuously symmetric and transitive, but not reflexive. (If X is also empty then R is reflexive.)
• The relation "is approximately equal to" between real numbers, even if more precisely defined, is not an equivalence relation, because although reflexive and symmetric, it is not transitive, since multiple small changes can accumulate to become a big change. However, if the approximation is defined asymptotically, for example by saying that two functions f and g are approximately equal near some point if the limit of f-g is 0 at that point, then this defines an equivalence relation.
• The relation "is a sibling of" on the set of all human beings is not an equivalence relation. Although siblinghood is symmetric (if A is a sibling of B, then B is a sibling of A), it is neither reflexive (no one is a sibling of himself) nor transitive (since if A is a sibling of B, then B is a sibling of A, but A is not a sibling of A). Instead of being transitive, siblinghood is "almost transitive", meaning that if A ~ B, and B ~ C, and A ≠ C, then A ~ C. However, the relation "A is a sibling of B or A is B" is an equivalence relation. (This applies only to full siblings. A and B could have the same mother, and B and C the same father, without A and C having a common parent.)
• The concept of parallelism in ordered geometry is not symmetric and is, therefore, not an equivalence relation.
• An equivalence relation on a set is never an equivalence relation on a proper superset of that set. For example, R = {(1,1), (1,2), (1,3), (2,1), (2,2), (2,3), (3,1), (3,2), (3,3)} is an equivalence relation on {1,2,3} but not on {1,2,3,4} or on the natural numbers. The problem is that reflexivity fails because (4,4) is not a member.
Connection to other relations
• A congruence relation is an equivalence relation whose domain X is also the underlying set for an algebraic structure, and which respects the additional structure. In general, congruence relations play the role of kernels of homomorphisms, and the quotient of a structure by a congruence relation can be formed. In many important cases congruence relations have an alternative representation as substructures of the structure on which they are defined. E.g. the congruence relations on groups correspond to the normal subgroups.
• A partial order replaces symmetry with antisymmetry and is thus reflexive, antisymmetric, and transitive. Equality is the only relation that is both an equivalence relation and a partial order.
• A strict partial order is irreflexive, transitive, and asymmetric.
• A partial equivalence relation is transitive and symmetric. Transitivity and symmetry imply reflexivity iff for all a ∈ X there exists b ∈ X such that a ~ b.
• A dependency relation is reflexive and symmetric.
• A preorder is reflexive and transitive.
Equivalence class, quotient set, partition
Let X be a nonempty set with typical elements a and b. Some definitions:
• The set of all a and b for which a ~ b holds makes up an equivalence class of X by ~. Let [a] := {x ∈ X : x ~ a} denote the equivalence class to which a belongs. Then all elements of X equivalent to each other are also elements of the same equivalence class: for all a, b ∈ X, a ~ b iff [a] = [b].
• The set of all possible equivalence classes of X by ~, denoted X/~ := {[x] : x ∈ X}, is the quotient set of X by ~. If X is a topological space, there is a natural way of transforming X/~ into a topological space; see quotient space for the details.
• The projection of ~ is the function π : X → X/~, defined by π(x) = [x], mapping elements of X into their respective equivalence classes by ~.
Theorem on projections (Birkhoff and Mac Lane 1999: 35, Th. 19): Let the function f : X → B be such that a ~ b implies f(a) = f(b). Then there is a unique function g : X/~ → B such that f = g ∘ π. If f is a surjection and a ~ b iff f(a) = f(b), then g is a bijection.
• The equivalence kernel of a function f is the equivalence relation, denoted Ef, such that x Ef y iff f(x) = f(y). The equivalence kernel of an injection is the identity relation.
• A partition of X is a set P of subsets of X, such that every element of X is an element of a single element of P. Each element of P is a cell of the partition. Moreover, the elements of P are pairwise disjoint and their union is X.
Theorem ("Fundamental Theorem of Equivalence Relations": Wallace 1998: 31, Th. 8; Dummit and Foote 2004: 3, Prop. 2):
• An equivalence relation ~ partitions X.
• Conversely, corresponding to any partition of X, there exists an equivalence relation ~ on X.
In both cases, the cells of the partition of X are the equivalence classes of X by ~. Since each element of X belongs to a unique cell of any partition of X, and since each cell of the partition is identical to an equivalence class of X by ~, each element of X belongs to a unique equivalence class of X by ~. Thus there is a natural bijection from the set of all possible equivalence relations on X to the set of all partitions of X.
Counting possible partitions. Let X be a finite set with n elements. Since every equivalence relation over X corresponds to a partition of X, and vice versa, the number of possible equivalence relations on X equals the number of distinct partitions of X, which is the nth Bell number Bn (B0 = 1, B1 = 1, B2 = 2, B3 = 5, B4 = 15, ...).
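One convenient way to compute these counts is the Bell triangle, in which each row starts with the last entry of the previous row; the short Java sketch below (illustrative, not part of the note) prints B0 through B5 = 1, 1, 2, 5, 15, 52.

public class BellNumbers {
    public static void main(String[] args) {
        int n = 5;
        long[][] tri = new long[n + 1][n + 1];   // the Bell triangle
        tri[0][0] = 1;
        for (int i = 1; i <= n; i++) {
            tri[i][0] = tri[i - 1][i - 1];       // first entry copies previous row's last
            for (int j = 1; j <= i; j++)
                tri[i][j] = tri[i][j - 1] + tri[i - 1][j - 1];
        }
        for (int i = 0; i <= n; i++)
            System.out.println("B" + i + " = " + tri[i][0]);   // 1, 1, 2, 5, 15, 52
    }
}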
Generating equivalence relations
• Given any set X, there is an equivalence relation over the set of all possible functions X → X. Two such functions are deemed equivalent when their respective sets of fixpoints have the same cardinality, corresponding to cycles of length one in a permutation. Functions equivalent in this manner form an equivalence class on this set of functions, and these equivalence classes partition it.
• An equivalence relation ~ on X is the equivalence kernel of its surjective projection π : X → X/~ (Birkhoff and Mac Lane 1999: 33, Th. 18). Conversely, any surjection between sets determines a partition on its domain, the set of preimages of singletons in the codomain. Thus an equivalence relation over X, a partition of X, and a projection whose domain is X, are three equivalent ways of specifying the same thing.
• The intersection of any collection of equivalence relations over X (viewed as a subset of X × X) is also an equivalence relation. This yields a convenient way of generating an equivalence relation: given any binary relation R on X, the equivalence relation generated by R is the smallest equivalence relation containing R. Concretely, R generates the equivalence relation a ~ b iff there exist elements x1, x2, ..., xn in X such that a = x1, b = xn, and (xi, xi+1) ∈ R or (xi+1, xi) ∈ R for i = 1, ..., n-1. (A small code sketch of this construction follows this list.)
Note that the equivalence relation generated in this manner can be trivial. For instance:
• The equivalence relation generated by the binary relation ≤ has exactly one equivalence class, X itself, because x ~ y for all x and y.
• An antisymmetric relation has equivalence classes that are the singletons of X.
• Let r be any sort of relation on X. Then r ∪ r⁻¹ is a symmetric relation. The transitive closure s of r ∪ r⁻¹ assures that s is transitive and reflexive. Moreover, s is the "smallest" equivalence relation containing r, and r/s partially orders X/s.
• Equivalence relations can construct new spaces by "gluing things together." Let X be the unit Cartesian square [0,1] × [0,1], and let ~ be the equivalence relation on X defined by (a, 0) ~ (a, 1) and (0, b) ~ (1, b) for all a, b ∈ [0,1]. Then the quotient space X/~ can be naturally identified with a torus: take a square piece of paper, bend and glue together the upper and lower edge to form a cylinder, then bend the resulting cylinder so as to glue together its two open ends, resulting in a torus.
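For a finite set, the equivalence relation generated by R can be computed mechanically: take the symmetric closure, add the diagonal, and then form the transitive closure. The Java sketch below (a Warshall-style closure; class and method names are our own) illustrates this.

public class GeneratedEquivalence {
    // Returns the smallest equivalence relation containing r, as a boolean matrix.
    static boolean[][] generate(boolean[][] r) {
        int n = r.length;
        boolean[][] s = new boolean[n][n];
        for (int a = 0; a < n; a++)
            for (int b = 0; b < n; b++)
                s[a][b] = r[a][b] || r[b][a];       // symmetric closure
        for (int a = 0; a < n; a++) s[a][a] = true; // reflexive closure
        for (int k = 0; k < n; k++)                 // transitive closure (Warshall)
            for (int a = 0; a < n; a++)
                for (int b = 0; b < n; b++)
                    if (s[a][k] && s[k][b]) s[a][b] = true;
        return s;
    }

    public static void main(String[] args) {
        boolean[][] r = new boolean[4][4];
        r[0][1] = true;   // R = {(0,1), (2,3)} on {0, 1, 2, 3}
        r[2][3] = true;
        boolean[][] eq = generate(r);
        // 1 ~ 0 and 0 ~ 0 hold, but 0 ~ 3 does not: prints "true true false"
        System.out.println(eq[1][0] + " " + eq[0][0] + " " + eq[0][3]);
    }
}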
Composite Relations
If the elements of a set A are related to those of a set B, and those of B are in turn related to the elements of a set C, then one can expect a relation between A and C. For example, if Tom is my father (parent-child relation) and Sarah is a sister of Tom (sister relation), then Sarah is my aunt (aunt-nephew/niece relation). Composite relations give that kind of relation.
Definition (composite relation): Let R1 be a binary relation from a set A to a set B, and R2 a binary relation from B to a set C. Then the composite relation from A to C, denoted by R1R2 (also denoted by R1 ∘ R2), is defined as
R1R2 = {<a, c> | a ∈ A ∧ c ∈ C ∧ ∃b [b ∈ B ∧ <a, b> ∈ R1 ∧ <b, c> ∈ R2]}.
In English, this means that an element a in A is related to an element c in C if there is an element
b in B such that a is related to b by R1 and b is related to c by R2 . Thus R1R2 is a relation from A
to C via B in a sense. If R1 is a parent-child relation and R2 is a sister relation, then R1R2 is an
aunt-nephew/niece relation.
Example 1: Let A = {a1 , a2} , B = {b1 , b2 , b3} , and C = {c1 , c2} . Also let R1 = {<a1 , b1> , <a1 ,
b2> , <a2 , b3> } , and R2 = {<b1 , c1> , <b2 , c1> , <b2 , c2> , <b3 , c1> } . Then R1R2 = {<a1 , c1> ,
<a1 , c2> , <a2 , c1> } .
This is illustrated in the following figure. The dashed lines in the figure of R1R2 indicate the
ordered pairs in R1R2, and dotted lines show ordered pairs that produce the dashed lines. (The
lines in the left figure are all supposed to be solid lines.)
Example 2: If R is the parent-child relation on a set of people A, then RR, also denoted by R2, is
the grandparent-grandchild relation on A.
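The composition in Example 1 can be computed by joining R1 and R2 on their shared B element; the Java sketch below (the "x,y" string encoding of pairs is an assumption made for the example) reproduces R1R2 = {<a1, c1>, <a1, c2>, <a2, c1>}.

import java.util.LinkedHashSet;
import java.util.Set;

public class ComposeRelations {
    public static void main(String[] args) {
        // R1 from A to B and R2 from B to C, written as "x,y" pairs (Example 1).
        Set<String> r1 = Set.of("a1,b1", "a1,b2", "a2,b3");
        Set<String> r2 = Set.of("b1,c1", "b2,c1", "b2,c2", "b3,c1");

        Set<String> r1r2 = new LinkedHashSet<>();
        for (String p : r1)
            for (String q : r2) {
                String[] ab = p.split(",");
                String[] bc = q.split(",");
                if (ab[1].equals(bc[0]))            // shared middle element b
                    r1r2.add(ab[0] + "," + bc[1]);
            }
        System.out.println(r1r2);   // three pairs: a1->c1, a1->c2, a2->c1
    }
}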
More examples:
The digraphs of R for several simple relations R are shown below.
[Figure: digraphs of several simple relations.]
Properties of Composite Relations
Composite relations defined above have the following properties. Let R1 be a relation from A to B, and R2 and R3 be relations from B to C. Then
1. R1(R2R3) = (R1R2)R3
2. R1(R2 ∪ R3) = R1R2 ∪ R1R3
3. R1(R2 ∩ R3) ⊆ R1R2 ∩ R1R3
Proofs for these properties are not necessary.
Powers of Relation
Definition (power of relation): Let R be a binary relation on A. Then R^n for all natural numbers n is defined recursively as follows:
Basis Clause: R^0 = E, where E is the equality relation on A.
Inductive Clause: For any natural number n, R^(n+1) = R^n R.
3. Set the second participant attribute to the key of the child attribute.
Tip: Expand the parent business object, then expand the child attribute within the parent.
Select the key attribute from this child object.
4. Repeat steps 1-3 for each of the participants. As with all composite identity relationships, this relationship contains one participant for the generic business object and at least one participant for an application-specific business object. Each participant consists of two attributes: the key of the parent business object and the key of the child business object (from the attribute within the parent business object).
Restriction: To manage composite relationships, the server creates internal tables. A table is
created for each role in the relationship. A unique index is then created on these tables across all
key attributes of the relationship. (In other words, the columns which correspond to the key
attributes of the relationship are the participants of the index.) The column sizes of the internal
tables have a direct relation to the attributes of the relationship and are determined by the value
of the MaxLength attribute for the relationship.
Databases typically have restrictions on the size of the indexes that can be created. For instance,
DB2 has an index limitation of 1024 bytes with the default page size. Thus, depending on the
MaxLength attribute of a relationship and the number of attributes in a relationship, you could
run into an index size restriction while creating composite relationships.
Important:
• You must ensure that appropriate MaxLength values are set in the repository file for all key attributes of a relationship, such that the total index would never exceed the index size limitations of the underlying DBMS. If the MaxLength attribute for type String is not specified, the default is nvarchar(255) in SQL Server. Thus, if a relationship has N keys, all of type String and the default MaxLength attribute of 255 bytes, the index size would be ((N*255)*2) + 16 bytes. You can see that you would exceed the SQL Server 7 limit of 900 bytes quite easily when N takes values >= 2 for the default MaxLength value of 255 bytes for type String.
• Remember, too, that even when some DBMSes support large indexes, it comes at the cost of performance; hence, it is always a good idea to keep index sizes to the minimum.
The maintainCompositeRelationship() method deals only with composite keys that extend to
only two nested levels. In other words, the method cannot handle the case where the child
object's composite key depends on values in its grandparent objects.
Example: If A is the top-level business object, B is the child of A, and C is the child of B, the
two methods will not support the participant definitions for the child object C that are as follows:
• The participant type is A and the attributes are:
  • key attribute of A: ID
  • key attribute of B: B[0].ID
  • key attribute of C: B[0].C[0].ID
• The participant type is A and the attributes are:
  • key attribute of A: ID
  • key attribute of C: B[0].C[0].ID
To access a grandchild object, these methods only support the participant definitions that are as
follows:
• The participant type is B and the attributes are:
  • key attribute of B: ID
  • key attribute of C: C[0].ID
• The participant type is B and the attributes are:
  • key attribute of B: ID
  • first key attribute of C: C[0].ID1
  • second key attribute of C: C[0].ID2
Actions of General/APIs/Identity Relationship/Maintain Child Verb
The Maintain Child Verb function block will generate Java code that calls the mapping API
maintainChildVerb(), which will maintain the verb of the child objects in the destination
business object. It can handle child objects whose key attributes are part of a composite identity
relationship. When you call maintainChildVerb() as part of a composite relationship, make sure
that its last parameter has a value of true. This method ensures that the verb settings are
appropriate given the verb in the parent source object and the calling context.
Customizing map rules for a composite identity relationship
Once you have created the relationship definition and participant definitions for the composite
identity relationship, you can customize the map to maintain the composite identity relationship.
A composite identity relationship manages a composite key. Therefore, managing this kind of
relationship involves managing both parts of the composite key. To code a composite identity
relationship, you need to customize the mapping transformation rules for both the parent and
child business objects, as Table 3.1 shows.
Table 3.1 Activity function blocks for a composite identity relationship

Main map (parent business object):
  • Top-level business object attribute: use a Cross-Reference transformation rule.
  • Child attribute (child business object): use the following activity function blocks:
      General/APIs/Identity Relationship/Maintain Composite Relationship
      General/APIs/Identity Relationship/Maintain Child Verb
      General/APIs/Identity Relationship/Update My Children (optional)
  • Verb: define a Move or Set Value transformation for the verb.

Submap (child business object):
  • Key attribute (nonunique key).
If child business objects have a nonunique key attribute, you can relate these child business
objects in a composite identity relationship.
The following sections describe the steps for customizing this composite identity relationship:
• Steps for customizing the main map
• Customizing the submap
• Managing child instances
Steps for customizing the main map
In the map for the parent business object (the main map), add the mapping code to the parent
attributes:
1. Map the verb of the top-level business object by defining a Move or Set Value
transformation rule.
2. Define a Cross-Reference transformation between the top-level business objects.
3. Define a Custom transformation for the child attribute and use the General/APIs/Identity
Relationship/Maintain Composite Relationship function block in Activity Editor.
Steps for coding the child attribute
The child attribute of the parent object contains the child business object. This child object is
usually a multiple cardinality business object. It contains a key attribute whose value identifies
the child. However, this key value is not required to be unique. Therefore, it does not uniquely
identify one child object among those for the same parent nor is it sufficient to identify the child
object among child objects for all instances of the parent object.
To identify such a child object uniquely, the relationship uses a composite key. In the composite
key, the parent key uniquely identifies the parent object. The combination of parent key and
child key uniquely identifies the child object. In the map for the parent business object (the main
map), add the mapping code to the attribute that contains the child business object. In Activity
Editor for this attribute, perform the following steps to code a composite identity relationship:
1. Define a Submap transformation for the child business object attribute of the main map.
Usually mapping transformations for a child object are done within a submap, especially
if the child object has multiple cardinality.
2. In the main map, define a Custom transformation rule for the child verb and use the
General/APIs/Identity Relationship/Maintain Child Verb function block to maintain the
child business object's verb.
The last input parameter of the General/APIs/Identity Relationship/Maintain Child Verb
function block is a boolean flag to indicate whether the child objects are participating in a
composite relationship. Make sure you pass a value of true as the last argument to
maintainChildVerb() because this child object participates in a composite, not a simple
identity relationship. Make sure you call maintainChildVerb() before the code that calls
the submap.
3. To maintain this composite key for the parent source object, customize the mapping rule
to use the General/APIs/Identity Relationship/Maintain Composite Relationship function
block.
4. To maintain the relationship tables in the case where a parent object has an Update verb
caused by child objects being deleted, customize the mapping rule to use the
General/APIs/Identity Relationship/Update My Children function block.
Tip: Make sure the transformation rule that contains the Update My Children function
block has an execution order after the transformation rule that contains the Maintain
Composite Relationship function block.
Example of customizing the map for a Composite Identity Relationship
The following example describes how the map can be customized for a Composite Identity
Relationship.
1. In the main map, define a Custom transformation rule between the child business object's
verbs. Use the General/APIs/Identity Relationship/Maintain Child Verb function block in
the customized activity to maintain the verb for the child business objects.
The goal of this custom activity is to use the maintainChildVerb() API to set the child business object verb based on the map execution context and the verb of the parent business object. Figure 3.2 shows this custom activity.
Figure 3.2. Using the Maintain Child Verb function block
2. If necessary, define a Submap transformation rule between the child business object to
perform any mapping necessary in the child level.
3. Define a Custom transformation rule between the top-level business objects. Use the
General/APIs/Identity Relationship/Maintain Composite Relationship function block in
the customized activity to maintain the composite identity relationship for this map.
The goal of this custom activity is to use the maintainCompositeRelationship() API to maintain a composite identity relationship within the map. Figure 3.3 shows this custom activity.
Figure 3.3. Using the Maintain Composite Relationship function block
4. Define a Custom transformation rule mapping from the source top-level business object
to the destination child business object attribute. Use the General/APIs/Identity
Relationship/Update My Children function block in the customized activity to maintain
the child instances in the relationship.
The goal of this custom activity is to use the updateMyChildren() API to add or delete
child instances in the specified parent/child relationship of the identity relationship.
int currentBusObjIndex_For_ObjSAP_Order_SAP_OrderLineItem;
int lastInputIndex_For_ObjSAP_Order_SAP_OrderLineItem =
    srcCollection_For_ObjSAP_Order_SAP_OrderLineItem.getLastIndex();
// ----
IdentityRelationship.maintainChildVerb(
    "OrdrLine",
    "SAPOrln",
    "CWOrln",
    ObjSAP_Order,
    "SAP_OrderLineItem",
    ObjOrder,
    "OrderLineItem",
    cwExecCtx,
    true,
    true);
// ----
for (currentBusObjIndex_For_ObjSAP_Order_SAP_OrderLineItem = 0;
     currentBusObjIndex_For_ObjSAP_Order_SAP_OrderLineItem <=
         lastInputIndex_For_ObjSAP_Order_SAP_OrderLineItem;
     currentBusObjIndex_For_ObjSAP_Order_SAP_OrderLineItem++)
{
    BusObj currentBusObj_For_ObjSAP_Order_SAP_OrderLineItem =
        (BusObj) (srcCollection_For_ObjSAP_Order_SAP_OrderLineItem.elementAt(
            currentBusObjIndex_For_ObjSAP_Order_SAP_OrderLineItem));
    //
    // INVOKE MAP ON VALID OBJECTS
    // ---------------------------
    //
    // Invoke the map only on those children objects that meet
    // certain criteria.
    //
    if (currentBusObj_For_ObjSAP_Order_SAP_OrderLineItem != null)
    {
        BusObj[] _cw_inObjs = new BusObj[2];
        _cw_inObjs[0] =
            currentBusObj_For_ObjSAP_Order_SAP_OrderLineItem;
        _cw_inObjs[1] = ObjSAP_Order;
        logInfo("*** Inside SAPCW header, verb is: " +
            (_cw_inObjs[0].getVerb()));
        try
        {
WEEK 4:
SETS
This week's learning outcomes:
• Define sets.
• Define set operations.
• Define the elements of a set, subsets, supersets, universal sets and the null set.
• Explain storage set representation.
The type SET
The objects of study of Set Theory are sets. As sets are fundamental objects that can be used to
define all other concepts in mathematics, they are not defined in terms of more fundamental
concepts. Rather, sets are introduced either informally, and are understood as something self-
evident, or, as is now standard in modern mathematics, axiomatically, and their properties are
postulated by the appropriate formal axioms.
The language of set theory is based on a single fundamental relation, called membership. We say that A is a member of B (in symbols A ∈ B), or that the set B contains A as its element. The understanding is that a set is determined by its elements; in other words, two sets are deemed equal if they have exactly the same elements. In practice, one considers sets of numbers, sets of points, sets of functions, sets of some other sets and so on. In theory, it is not necessary to distinguish between objects that are members and objects that contain members -- the only objects one needs for the theory are sets.
The type SET denotes sets whose elements are integers in the range 0 to a small number,
typically 31 or 63.
Given, for example, variables
  VAR r, s, t: SET
possible assignments are
  r := {5};  s := {x, y..z};  t := {}
Here, the value assigned to r is the singleton set consisting of the single element 5; to t is assigned the empty set, and to s the elements x, y, y+1, ..., z-1, z.
Set Operations
A relation is a set. It is a set of ordered pairs if it is a binary relation, and it is a set of ordered n-tuples if it is an n-ary relation. Thus all the set operations, such as union, intersection and complementing, apply to relations.
For example, the union of the "less than" and "equality" relations on the set of integers is the "less than or equal to" relation on the set of integers. The intersection of the "less than" and "less than or equal to" relations on the set of integers is the "less than" relation on the same set. The complement of the "less than" relation on the set of integers is the "greater than or equal to" relation on the same set.
Therefore, the following elementary operators are defined on variables of type SET:
* set intersection
+ set union
- set difference
/ symmetric set difference
IN set membership
Constructing the intersection or the union of two sets is often called set multiplication or set
addition, respectively; the priorities of the set operators are defined accordingly, with the
intersection operator having priority over the union and difference operators, which in turn have
priority over the membership operator, which is classified as a relational operator. Following are
examples of set expressions and their fully parenthesized equivalents:
r * s + t = (r*s) + t
r - s * t = r - (s*t)
r - s + t = (r-s) + t
r + s / t = r + (s/t)
x IN s + t = x IN (s+t)
Sets, Elements, and Subsets
One dictionary has, among the many definitions for set, the following: a number of things naturally
connected by location, formation, or order in time.
Although set holds the record for words with the most dictionary definitions, there are terms
mathematicians choose to leave undefined, or actually, defined by usage. Set, element, member,
and subset are four such terms which will be discussed in today's lesson. Today's activity will
also explore the concept.
A proper subset does not contain every member of the original set.
Sets may be finite, {1, 2, 3, ..., 10}, or infinite, {1, 2, 3, ...}. The cardinality of a set A, n(A), is how many elements are in the set. The symbol ..., called an ellipsis, means to continue in the indicated pattern. There are 2^n subsets of any set, where n is the set's cardinality—check it out for n=3!
The power set of a set is the complete set of subsets of the set.
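One way to see why a set of cardinality n has 2^n subsets is to encode each subset as an n-bit number whose i-th bit says whether the i-th element is included; the Java sketch below (illustrative, not from the note) lists all 8 subsets of {a, b, c} this way.

public class PowerSet {
    public static void main(String[] args) {
        char[] set = {'a', 'b', 'c'};
        int n = set.length;
        // Every integer 0 .. 2^n - 1 encodes one subset: bit i set => element i included.
        for (int mask = 0; mask < (1 << n); mask++) {
            StringBuilder subset = new StringBuilder("{");
            for (int i = 0; i < n; i++)
                if ((mask & (1 << i)) != 0) subset.append(set[i]);
            System.out.println(subset.append("}"));   // 8 subsets in total
        }
    }
}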
In this class we will consider only safe sets, that is, any set we consider should be well-defined.
There should be no ambiguity as to whether or not an element belongs to a set. That is why we
will avoid things like the village barber who shaves everyone in the village that does not shave
himself. This results in a contradiction as to whether or not he shaves himself. Also consider
Russell's Paradox: Form the set of sets that are not members of themselves. It is both true and
false that this set must contain itself. These are examples of ill-defined sets.
Sometimes, instead of listing elements in a set, we use set builder notation: {x | x is a letter in the word "mathematics"}. The symbol | can be read as "such that." Sometimes the symbol ⊂ is reserved to mean proper subset and the symbol ⊆ is used to allow the inclusion of the improper subset. Compare this with the use of < and ≤ to exclude or include an endpoint. We will make no such distinction. A set may contain the same elements as another set. Such sets are equal or identical sets—element order is unimportant. A = B where A = {m,o,r,e} and B = {r,o,m,e}; in general A = B if A ⊆ B and B ⊆ A. Sets may be termed equivalent if they have the same cardinality. If they are equivalent, a one-to-one correspondence can be established between their elements.
The universal set is chosen arbitrarily, but must be large enough to include all elements of all sets under
discussion.
Complementary set, A', is a set that contains all the elements of the universal set that are not included
in A. The symbol ' can be read "prime."
For example: if U={0, 1, 2, 3, 4, 5, 6,...} and A={0, 2, 4, 6, ...}, then A'={1, 3, 5, ...}.
Such paradoxes as those mentioned above, particularly involving infinities (discussed in the
next lesson), were well known by the ancient Greeks. During the 19th century, mathematicians
were able to tame such paradoxes and about the turn of the 20th century Whitehead and Russell
started an ambitious project to carefully codify mathematics. Set theory was developed about this
time and serves to unify the many branches of mathematics. Although in 1931 Kurt Gödel
showed this approach to be fatally flawed, it is still a good way to explore areas of mathematics
such as: arithmetic, number theory, [abstract] algebra, geometry, probability, etc.
Geometry has a long history of such systematic study. The ancient Greek Euclid similarly
codified the mathematics of his time into 13 books called The Elements. Although these books
were not limited to Geometry, that is what they are best known for. In fact, up until about my
grandfather's day, The Elements was the textbook of choice for the study of Geometry! The
Elements carefully separated the assumptions and definitions from what was to be proved. The
concept of proof dates back another couple hundred years to the ancient Greek Pythagoras and
his school, the Pythagorean School.
Intersection and Union
Once we have created the concept of a set, we can manipulate sets in useful ways termed set operations. Consider the following sets: animals, birds, and white things. Some animals are white: polar bears, mountain goats, big horn sheep, for example. Some birds are white: doves, storks, sea gulls. Some white things are not birds or animals (but birds are animals!): snow, milk, wedding gowns (usually).
The intersection of sets is the set of those elements which belong to all intersected sets. Although we usually intersect only two sets, the definition above is general. The symbol for intersection is ∩.
The union of sets is the set of those elements which belong to any set in the union. Again, although we usually form the union of only two sets, the definition above is general. The symbol for union is ∪.
For the example given above, we can see that:
{white things} ∩ {birds} = white birds
{white animals} ∪ {birds} = white animals and all birds
{white birds} ∪ {white animals} ⊂ {animals}
Another name for intersection is conjunction. This comes from the fact that an element must be a member of set A and set B to be a member of A ∩ B. Another name for union is disjunction. This comes from the fact that an element must be a member of set A or set B to be a member of A ∪ B. Conjunction and disjunction are grammar terms and date back to when Latin was widely used.
I should note the very mathematical use of the word or in the sentence above. Common usage now of
the word or means one or the other, but not both (excludes both). Mathematicians and computer
scientists on the other hand mean one or the other, possibly both (including both). This ambiguity can
cause all kinds of problems! Mathematicians term the former exclusive or (EOR or XOR) and the latter
inclusive or. We will see ands & ors again in numbers lesson 6 on truth tables.
Representation of Sets
A set s is conveniently represented in a computer store by its characteristic function C(s). This is
an array of logical values whose ith component has the meaning "i is present in s". As an
example, the set of small
integers s = {2, 3, 5, 7, 11, 13} is represented by the sequence of bits, by a bitstring:
C(s) = ( 0010100010101100)
The representation of sets by their characteristic function has the advantage that the operations of
computing the union, intersection, and difference of two sets may be implemented as elementary
logical operations. The following equivalences, which hold for all elements i of the base type of
the sets x and y, relate logical operations with operations on sets:
i IN (x+y) = (i IN x) OR (i IN y)
i IN (x*y) = (i IN x) & (i IN y)
i IN (x-y) = (i IN x) & ~(i IN y)
These logical operations are available on all digital computers, and moreover they operate concurrently on all corresponding elements (bits) of a word. It therefore appears that in order to be able to implement the basic set operations in an efficient manner, sets must be represented in a small, fixed number of words upon which not only the basic logical operations, but also those of shifting, are available. Testing for membership is then implemented by a single shift and a subsequent (sign) bit test operation. As a consequence, a test of the form x IN {c1, c2, ..., cn} can be implemented considerably more efficiently than the equivalent Boolean expression
(x = c1) OR (x = c2) OR ... OR (x = cn)
A corollary is that the set structure should be used only for small integers as elements, the largest one being the wordlength of the underlying computer (minus 1).
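As a rough sketch of this idea in Java (assuming, as above, that elements are small integers in 0..63 so that one 64-bit word holds the characteristic function), the set operations become single bitwise instructions; the class and method names below are our own.

public class BitSetDemo {
    static long set(int... members) {            // build C(s) as a 64-bit word
        long s = 0L;
        for (int m : members) s |= 1L << m;      // elements must be in 0..63
        return s;
    }
    static boolean in(int i, long s) { return (s & (1L << i)) != 0; }

    public static void main(String[] args) {
        long x = set(2, 3, 5, 7, 11, 13);
        long y = set(3, 5, 8);

        long union        = x | y;               // i IN (x+y) = (i IN x) OR (i IN y)
        long intersection = x & y;               // i IN (x*y) = (i IN x) AND (i IN y)
        long difference   = x & ~y;              // i IN (x-y) = (i IN x) AND NOT (i IN y)

        System.out.println(in(8, union));        // true
        System.out.println(in(2, intersection)); // false
        System.out.println(in(2, difference));   // true
    }
}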
WEEK 5:
String structure
This week's learning outcomes:
• Define string.
• Explain basic operations on strings.
String Processing
A finite sequence S of zero or more characters is called a string. The number of characters in a string is called its length. The string with zero characters is called the empty string or the null string. The following are strings of length 9, 19, 14 and 0 respectively:
I.) "ND1 CLASS"
II.) "COMPUTER DEPARTMENT"
III.) "CAMPUS SHUTTLE", and
IV.) " "
Note that the blank is regarded as a character only when it appears with other characters.
Concatenation of Strings
Let S1 and S2 be strings. The string consisting of the characters of S1 followed by the characters of S2 is called the concatenation of S1 and S2. This is denoted by S1//S2, e.g. "STARLETS"//"DEFEAT"//"EAGLET".
A string Y is called a substring of a string S if there exist strings X and Z such that S = X//Y//Z. If X is an empty string, then Y is called an initial substring of S, and if Z is an empty string then Y is called a terminal substring of S.
E.g. 'BE OR NOT' is a substring of 'TO BE OR NOT TO BE'.
'THE' is an initial substring of 'THE END'.
Length
The general form is LENGTH(string) and this will return the number of characters in a given string.
i.) LENGTH('student') = 7
ii.) LENGTH('') = 0
Insertion
Suppose we want to insert a string S into a given text T so that S starts in position K. We denote this operation as INSERT(text, position, string).
E.g. INSERT('ABCDEFG', 3, 'XYZ') = 'ABXYZCDEFG'
The INSERT function can be implemented by using the string operations as follows:
INSERT(T,K,S) = SUBSTRING(T,1,K-1)//S//SUBSTRING(T,K,LENGTH(T)-K+1).
That is, the initial substring of T before the position K , which has length K-1, is
concatenated with the string, and the result is concatenated with the remaining part of T,
which begins in position K and has length, LENGTH(T)-(K-1)=LENGTH(T)-K+1.
Deletion
The general form is DELETE(text,position,length).
E.g. DELETE('ABCDEFG',4,2)='ABCFG'
We assume that nothing is deleted if position K=0.
Thus DELETE('ABCDEFG',0,2)='ABCDEFG'
The DELETE function can be implemented using the string operations as follows:
DELETE(T,K,L) = SUBSTRING(T,1,K-1)//SUBSTRING(T,K+L,LENGTH(T)-K-L+1)
That is, the initial substring of T before position K is concatenated with the terminal substring of T beginning in position K+L, and the length of the terminal substring is:
LENGTH(T)-(K+L-1) = LENGTH(T)-K-L+1
When K=0, we assume that DELETE(T,K,L) = T.
Suppose that text T and pattern P are given and it is required to delete from T the first occurrence of the pattern P. We can use the following DELETE function:
DELETE(T,INDEX(T,P),LENGTH(P))
E.g. if T = 'ABCDEFG' and P = 'CD', then
DELETE(T,INDEX(T,P),LENGTH(P)) = DELETE('ABCDEFG',INDEX('ABCDEFG','CD'),2) = 'ABEFG'
Suppose that we want to delete every occurrence of the pattern P in the text T. We can do this by repeatedly applying DELETE(T,INDEX(T,P),LENGTH(P)) until INDEX(T,P) = 0, that is, until P does not appear in T.
The following algorithm is used to accomplish this:
Algorithm:
A text T and a pattern P are in computer memory. This algorithm deletes every occurrence of P in T.
i.) Find the index of P in T. Set K := INDEX(T,P).
ii.) Repeat while K is not equal to 0:
    a. [Delete P from T.] Set T := DELETE(T,INDEX(T,P),LENGTH(P))
    b. [Update index.] Set K := INDEX(T,P)
    [End of loop]
iii.) Write: T.
iv.) Exit.
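A minimal Java sketch of these operations (using Java's 0-based substring while keeping the note's 1-based positions; the class and method names are our own) is shown below.

public class StringOps {
    // INSERT(T, K, S): S starts at position K (positions are 1-based, as in the text).
    static String insert(String t, int k, String s) {
        return t.substring(0, k - 1) + s + t.substring(k - 1);
    }
    // DELETE(T, K, L): remove L characters starting at position K; K = 0 deletes nothing.
    static String delete(String t, int k, int l) {
        if (k == 0) return t;
        return t.substring(0, k - 1) + t.substring(k - 1 + l);
    }
    // Delete every occurrence of pattern P in text T (the algorithm above).
    static String deleteAll(String t, String p) {
        int k = t.indexOf(p);                     // INDEX(T, P), 0-based; -1 if absent
        while (k != -1) {
            t = delete(t, k + 1, p.length());
            k = t.indexOf(p);
        }
        return t;
    }

    public static void main(String[] args) {
        System.out.println(insert("ABCDEFG", 3, "XYZ"));   // ABXYZCDEFG
        System.out.println(delete("ABCDEFG", 4, 2));       // ABCFG
        System.out.println(deleteAll("ABCDEFG", "CD"));    // ABEFG
    }
}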
REPLACEMENT
WEEK 6:
Queues and Stacks
This week's learning outcomes:
• Queue data structure.
• Stack data structure.
Queues
Queues are dynamic collections which have some concept of order. This can be based on the order of entry into the queue, giving us First-In-First-Out (FIFO) or Last-In-First-Out (LIFO) queues. Both of these can be built with linked lists: the simplest "add-to-head" implementation of a linked list gives LIFO behaviour. A minor modification - adding a tail pointer and adjusting the addition method implementation - will produce a FIFO queue.
Representation of queues:
Two pointer variables, namely FRONT and REAR, are used in queues. FRONT contains the location of the front element of the queue, and REAR contains the location of the rear element of the queue. The condition FRONT = NULL will indicate that the queue is empty.
Whenever an item is deleted from the queue, the value of FRONT is increased by 1, that is, FRONT = FRONT + 1.
Whenever an item is added to the queue, the value of REAR is increased by 1, that is, REAR = REAR + 1.
QUEUE:  [1] A   [2] B   [3] C   [4] D   [5] ...   [N-1]   [N]
FRONT = 1 and REAR = 4
Suppose that we want to insert an element ITEM into a queue at a time when the queue occupies the last part of the array, that is, when REAR = N and the queue is not yet full. To do this, we can assume that the queue is circular, that is, that QUEUE(1) comes after QUEUE(N) in the array. With this assumption, we insert ITEM into the queue by assigning ITEM to QUEUE(1): instead of increasing REAR to N+1, we reset REAR = 1 and then assign QUEUE[REAR] = ITEM.
Similarly, if FRONT = N and an element of the queue is deleted, we reset FRONT = 1 instead of increasing FRONT to N+1.
If the queue is empty, we assign FRONT = REAR = NULL.
EXAMPLE
The following example shows how a queue may be maintained by a circular array QUEUE with N = 5 memory locations.
a.) Initially empty: FRONT = 0, REAR = 0
    [1] _   [2] _   [3] _   [4] _   [5] _
b.) A, B and C inserted: FRONT = 1, REAR = 3
    [1] A   [2] B   [3] C   [4] _   [5] _
c.) A deleted: FRONT = 2, REAR = 3
    [1] _   [2] B   [3] C   [4] _   [5] _
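A minimal Java sketch of such a circular queue (array slots 1..N as in the text, with 0 standing in for NULL; the class name is our own) is shown below.

public class CircularQueue {
    private final Object[] queue;
    private int front = 0, rear = 0, count = 0;   // 0 plays the role of NULL here

    CircularQueue(int n) { queue = new Object[n + 1]; }   // use slots 1..N

    void insert(Object item) {
        if (count == queue.length - 1) throw new IllegalStateException("overflow");
        if (count == 0) { front = 1; rear = 1; }
        else rear = (rear == queue.length - 1) ? 1 : rear + 1;   // wrap REAR to 1 after N
        queue[rear] = item;
        count++;
    }

    Object delete() {
        if (count == 0) throw new IllegalStateException("underflow");
        Object item = queue[front];
        front = (front == queue.length - 1) ? 1 : front + 1;     // wrap FRONT to 1 after N
        count--;
        if (count == 0) { front = 0; rear = 0; }                 // queue is empty again
        return item;
    }

    public static void main(String[] args) {
        CircularQueue q = new CircularQueue(5);
        q.insert("A"); q.insert("B"); q.insert("C");   // FRONT = 1, REAR = 3
        System.out.println(q.delete());                // A  (FRONT becomes 2)
    }
}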
Stacks
Another way of storing data is in a stack. A stack is a linear structure in which items may be added or removed only at one end, for example a stack of dishes or a stack of pennies. Only two (2) operations can be carried out on a stack. A stack is generally implemented with only two principal operations (apart from constructor and destructor methods):
Push - adds an item to the stack
Pop  - extracts the most recently pushed item from the stack
Other methods, such as
Top      - returns the item at the top without removing it
isempty  - determines whether the stack has anything in it
are sometimes added as well. An array implementation is often the natural and efficient one (in most operating systems, allocation and de-allocation of memory is a relatively expensive operation, so there is a penalty for the flexibility of linked list implementations).
Stack Frames
The data structure containing all the data (arguments, local variables, return address, etc) needed
each time a procedure or function is called.
Almost invariably, programs compiled from modern high level languages (even C!) make use of
a stack frame for the working memory of each procedure or function invocation. When any
procedure or function is called, a number of words - the stack frame - is pushed onto a program
stack. When the procedure or function returns, this frame of data is popped off the stack.
As a function calls another function, first its arguments, then the return address and finally space
for local variables is pushed onto the stack. Since each function runs in its own "environment" or
context, it becomes possible for a function to call itself - a technique known as recursion. This
capability is extremely useful and extensively used - because many problems are elegantly
specified or solved in a recursive way.
Program stack after executing a pair of mutually
recursive functions:
function f(int x, int y) {
int a;
if ( term_cond ) return ...;
a = .....;
return g(a);
}
function g(int z) {
int p,q;
p = ...; q = ...;
return f(p,q);
}
Note how all of function f and g's environment (their parameters and local variables) are found in the stack frame. When f is called a second time from g, a new frame for the second invocation of f is created.
Key terms
push, pop
    Generic terms for adding something to, or removing something from, a stack.
context
    The environment in which a function executes: includes argument values, local variables and global variables. All the context except the global variables is stored in a stack frame.
A number of programming languages are stack-oriented, meaning they define most basic
operations (adding two numbers, printing a character) as taking their arguments from the stack,
and placing any return values back on the stack. For example, PostScript has a return stack and
an operand stack, and also has a graphics state stack and a dictionary stack.
Forth uses two stacks, one for argument passing and one for subroutine return addresses. The use
of a return stack is extremely commonplace, but the somewhat unusual use of an argument stack
for a human-readable programming language is the reason Forth is referred to as a stack-based
language.
Many virtual machines are also stack-oriented, including the p-code machine and the Java virtual machine.
Almost all computer runtime memory environments use a special stack (the "call stack") to hold
information about procedure/function calling and nesting in order to switch to the context of the
called function and restore to the caller function when the calling finishes. They follow a runtime
protocol between caller and callee to save arguments and return value on the stack. Stacks are an
important way of supporting nested or recursive function calls. This type of stack is used
implicitly by the compiler to support CALL and RETURN statements (or their equivalents) and
is not manipulated directly by the programmer.
Some programming languages use the stack to store data that is local to a procedure. Space for
local data items is allocated from the stack when the procedure is entered, and is deallocated
when the procedure exits. The C programming language is typically implemented in this way.
In computer science, a stack is an abstract data type and data structure based on
the principle of Last In First Out (LIFO). Stacks are used extensively at every level of a modern
computer system. For example, a modern PC uses stacks at the architecture level, which are used
in the basic design of an operating system for interrupt handling and operating system function
calls. Among other uses, stacks are used to run a Java Virtual Machine, and the Java language
itself has a class called "Stack", which can be used by the programmer. The stack is ubiquitous.
As an abstract data type, the stack is a container of nodes and has two basic operations: push and
pop. Push adds a given node to the top of the stack leaving previous nodes below. Pop removes
and returns the current top node of the stack. A frequently used metaphor is the idea of a stack of
plates in a spring loaded cafeteria stack. In such a stack, only the top plate is visible and
accessible to the user; all other plates remain hidden. As new plates are added, each new plate
becomes the top of the stack, hiding the plates below and pushing the stack of plates down. When
the top plate is removed, the plates pop back up and the second plate becomes the top of the
stack. Two important principles are illustrated by this metaphor: the
Last In First Out principle is one; the second is that the contents of the stack are hidden. Only the
top plate is visible, so to see what is on the third plate, the first and second plates will have to be
removed. This can also be written as FILO-First In Last Out, i.e. the record inserted first will be
popped out at last.
Operations
In modern computer languages, the stack is usually implemented with more operations than just
"push" and "pop". The length of a stack can often be returned as a parameter. Another helper
operation, top (also known as peek), can return the current top element of the stack
without removing it from the stack.
This section gives pseudocode for adding or removing nodes from a stack, as well as the length
and top functions. Throughout we will use null to refer to an end-of-list marker or sentinel value,
which may be implemented in a number of ways using pointers.
record Node {
data // The data being stored in the node
next // A reference to the next node; null for last node
}
record Stack {
Node stackPointer
// points to the 'top' node; null for an empty stack
}
function push(Stack stack, Element element) { // push element onto stack
    new(newNode)                              // Allocate memory to hold new node
    newNode.data := element
    newNode.next := stack.stackPointer
    stack.stackPointer := newNode
}
function pop(Stack stack) { // remove the 'top' node and return its data
    node := stack.stackPointer
    stack.stackPointer := node.next
    element := node.data
    destroy node
    return element
}
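For comparison, the same stack can be sketched in C with a linked representation. This is a minimal illustrative sketch, not part of the original note: the int payload, the function names and the peek helper are assumptions.

#include <stdio.h>
#include <stdlib.h>

/* A node holds one value and a link to the node below it on the stack. */
struct node {
    int data;
    struct node *next;
};

/* Push: allocate a new node and make it the new top of the stack. */
void push(struct node **top, int value) {
    struct node *n = malloc(sizeof *n);
    n->data = value;
    n->next = *top;
    *top = n;
}

/* Pop: remove the top node and return its data (caller must ensure the stack is not empty). */
int pop(struct node **top) {
    struct node *n = *top;
    int value = n->data;
    *top = n->next;
    free(n);
    return value;
}

/* Peek (top): return the top value without removing it. */
int peek(const struct node *top) {
    return top->data;
}

int main(void) {
    struct node *stack = NULL;      /* NULL plays the role of the end-of-list sentinel */
    push(&stack, 1);
    push(&stack, 2);
    printf("%d\n", peek(stack));    /* prints 2 */
    printf("%d\n", pop(stack));     /* prints 2 */
    printf("%d\n", pop(stack));     /* prints 1 */
    return 0;
}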
Example: evaluating the postfix expression 1 2 + 4 * 3 +
The expression is evaluated from left to right using a stack:
• push when encountering an operand, and
• pop two operands and evaluate the value when encountering an operator, then
• push the result.
The evaluation proceeds as follows (the stack is shown after each operation has taken place):
Input   Operation       Stack
1       Push operand    1
2       Push operand    1, 2
+       Add             3
4       Push operand    3, 4
*       Multiply        12
3       Push operand    12, 3
+       Add             15
The final result, 15, lies on the top of the stack at the end of the calculation.
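The evaluation just traced can also be sketched in C. The sketch below is illustrative only: it assumes single-digit operands (so the expression can be written without separators, as above) and an invented function name.

#include <ctype.h>
#include <stdio.h>

/* Evaluate a postfix expression of single-digit operands, e.g. "12+4*3+". */
int eval_postfix(const char *expr) {
    int stack[64], top = -1;                 /* small fixed-size operand stack */
    for (; *expr != '\0'; expr++) {
        if (isdigit((unsigned char)*expr)) {
            stack[++top] = *expr - '0';      /* push the operand */
        } else {
            int b = stack[top--];            /* pop two operands ...          */
            int a = stack[top--];
            switch (*expr) {                 /* ... evaluate the operator ... */
            case '+': stack[++top] = a + b; break;
            case '-': stack[++top] = a - b; break;
            case '*': stack[++top] = a * b; break;
            case '/': stack[++top] = a / b; break;
            }                                /* ... and push the result       */
        }
    }
    return stack[top];                       /* the final result lies on top  */
}

int main(void) {
    printf("%d\n", eval_postfix("12+4*3+")); /* prints 15 */
    return 0;
}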
Example: an implementation in Pascal, using a marked sequential file as the data archive.
{
programmer : clx321
file : stack.pas
unit : Pstack.tpu
}
program TestStack;
{this program uses the ADT of Stack; we assume that the unit implementing the
Stack ADT already exists}
uses
PStack;
{ADT of STACK}
{dictionary}
const
mark = '.';
var
data : stack;
f : text;
cc : char;
ccInt, cc1, cc2 : integer;
{functions}
function IsOperand (cc : char) : boolean;
{prototype only: returns TRUE if cc is an operand}
function ChrToInt (cc : char) : integer;
{prototype only: converts a character digit to an integer}
function Operator (cc1, cc2 : integer) : integer;
{prototype only: applies the operator to the two operands}
{algorithms}
begin
assign (f, 'archive.txt'); {the archive filename here is an assumed placeholder}
reset (f);
read (f, cc); {first elmt}
if (cc = mark) then
begin
writeln ('empty archives !');
end
else
begin
repeat
if (IsOperand (cc)) then
begin
ccInt := ChrToInt (cc);
push (ccInt, data);
end
else
begin
pop (cc1, data);
pop (cc2, data);
push (Operator (cc2, cc1), data);
end;
read (f, cc);
{next elmt}
until (cc = mark);
end;
close (f);
end.
The use of a runtime stack for both data and procedure calls has important security implications (see below)
of which a programmer must be aware in order to avoid introducing serious security bugs into a
program.
Security
Some computing environments use stacks in ways that may make them vulnerable to security
breaches and attacks. Programmers working in such environments must take special care to
avoid the pitfalls of these implementations.
For example, some programming languages use a common stack to store both data local to a
called procedure and the linking information that allows the procedure to return to its caller. This
means that the program moves data into and out of the same stack that contains critical return
addresses for the procedure calls. If data is moved to the wrong location on the stack, or an
oversized data item is moved to a stack location that is not large enough to contain it, return
information for procedure calls may be corrupted, causing the program to fail.
Malicious parties may attempt to take advantage of this type of implementation by providing
oversized data input to a program that does not check the length of input. Such a program may
copy the data in its entirety to a location on the stack, and in so doing it may change the return
addresses for procedures that have called it. An attacker can experiment to find a specific type of
data that can be provided to such a program such that the return address of the current procedure
is reset to point to an area within the stack itself (and within the data provided by the attacker),
which in turn contains instructions that carry out unauthorized operations.
This type of attack is a variation on the buffer overflow attack and is an extremely frequent
source of security breaches in software, mainly because some of the most popular programming
languages (such as C) use a shared stack for both data and procedure calls, and do not verify the
length of data items. Frequently programmers do not write code to verify the size of data items,
either, and when an oversized or undersized data item is copied to the stack, a security breach
may occur.
WEEK 7:
Properties of linear arrays.
This week's learning outcomes:
• Define a linear array.
• Discuss various operations that can be performed on an ordered list.
Linear Arrays
Data structures are classified as either linear or nonlinear. A data structure is said to be linear if its
elements form a sequence, that is, a linear list; otherwise it is nonlinear (for example, trees, graphs
and records). Nonlinear structures are mainly used to represent data containing a hierarchical
relationship between elements. Strings, arrays, lists and queues are linear types of data structure.
The operations normally performed on linear lists include the following (a short C sketch of some
of these operations follows the list):
a.) Traversal: processing each element in the list.
b.) Search: finding the location of the element with a given value or the record with
a given key.
c.) Insertion: adding a new element to the list.
d.) Deletion: removing an element from the list.
e.) Sorting: arranging the elements in some type of order.
f.) Merging: combining two lists into a single list.
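As a rough illustration of some of these operations, here is a minimal C sketch over a fixed-size integer array. The function names and the MAX bound are assumptions made for illustration only.

#include <stdio.h>

#define MAX 10

/* Traversal: process (here, print) each element in the list. */
void traverse(const int a[], int n) {
    for (int i = 0; i < n; i++)
        printf("%d ", a[i]);
    printf("\n");
}

/* Search: return the location of the element with the given value, or -1 if absent. */
int search(const int a[], int n, int key) {
    for (int i = 0; i < n; i++)
        if (a[i] == key)
            return i;
    return -1;
}

/* Insertion: add a new element at position pos, shifting later elements up; returns the new length. */
int insert_at(int a[], int n, int pos, int value) {
    for (int i = n; i > pos; i--)
        a[i] = a[i - 1];
    a[pos] = value;
    return n + 1;
}

/* Deletion: remove the element at position pos, shifting later elements down; returns the new length. */
int delete_at(int a[], int n, int pos) {
    for (int i = pos; i < n - 1; i++)
        a[i] = a[i + 1];
    return n - 1;
}

int main(void) {
    int a[MAX] = {24, 56, 405, 35, 87};
    int n = 5;
    n = insert_at(a, n, 2, 99);        /* 24 56 99 405 35 87 */
    n = delete_at(a, n, 0);            /* 56 99 405 35 87    */
    traverse(a, n);
    printf("position of 35: %d\n", search(a, n, 35));
    return 0;
}

Sorting and merging are left out to keep the sketch short.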
A linear array is a list of a finite number, n, of homogeneous data elements, where the
number n of elements is called the length or size of the array. The elements of an array A
may be denoted as follows:
A1, A2, ..., An   or   A(1), A(2), ..., A(n)   or   A[1], A[2], ..., A[n]
where 1 is the lower bound, LB, of the array and n the upper bound, UB, of the array.
Example:
Let DATA be a 5-element linear array of integers such that DATA[1]=24,
DATA[2]=56. DATA[3]=405. DATA[4]=35, DATA[5]=87
The array DATA is frequently pictured either vertically or horizontally, as follows:
DATA[1]   24
DATA[2]   56
DATA[3]   405
DATA[4]   35
DATA[5]   87
or
DATA:   24   56   405   35   87
         1    2     3    4    5
MULTIDIMENSIONAL ARRAYS
The linear arrays discussed earlier are also one-dimensional arrays, since each element in
the array is referenced by a single subscript. Most programming languages allow 2-
dimensional and 3-dimensional arrays, and some allow the number of dimensions for an
array to be as high as 7.
Two-dimensional arrays are called matrices in mathematics and tables in business
applications.
A 2-dimensional m x n array A is a collection of m.n data elements such that each element is
specified by a pair of integers (such as J, K), called subscripts, with the property that
1 ≤ J ≤ m and 1 ≤ K ≤ n.
The element of A with first subscript J and second subscript K will be denoted by
A(J,K) or A[J,K].
Example
Suppose each student in a class of 10 students is given 3 tests. Assuming the students
are numbered accordingly, the test scores can be assigned to a 10 x 3 matrix array
SCORE. Thus SCORE[K, L] contains the Kth student's score on the Lth test.
This can be represented as follows:
Student   Test 1   Test 2   Test 3
1         56       46       90
2         78       90       98
.         .        .        .
.         .        .        .
10        78       82       85
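A minimal C sketch of such a score table is shown below. Note that C subscripts run from 0 while the text numbers students and tests from 1; the scores beyond the first two students are assumed for illustration.

#include <stdio.h>

#define STUDENTS 10
#define TESTS     3

int main(void) {
    /* SCORE[k][l] holds student k's score on test l (0-based here). */
    int SCORE[STUDENTS][TESTS] = {
        {56, 46, 90},   /* student 1 */
        {78, 90, 98},   /* student 2 */
        /* remaining rows would be filled in the same way */
    };

    /* Print the scores of the first two students. */
    for (int k = 0; k < 2; k++) {
        for (int l = 0; l < TESTS; l++)
            printf("%4d", SCORE[k][l]);
        printf("\n");
    }
    return 0;
}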
The Array Structure
The array is probably the most widely used data structure; in some languages it is even the only
one available. An array consists of components which are all of the same type, called its base
type; it is therefore called a homogeneous structure. The array is a random-access structure,
because all components can be selected at random and are equally quickly accessible. In order to
denote an individual component, the name of the entire structure is augmented by the index
selecting the component. This index is to be an integer between 0 and n-1, where n is the number
of elements, the size, of the array.
TYPE T = ARRAY n OF T0
Examples
TYPE Row = ARRAY 4 OF REAL
TYPE Card = ARRAY 80 OF CHAR
TYPE Name = ARRAY 32 OF CHAR
A particular value of a variable
VAR x: Row
with all components satisfying the equation xi = 2^(-i) may be visualized as shown in Fig. 1.2.
Fig. 1.2 Array of type Row with xi = 2^(-i)
An individual component of an array can be selected by an index. Given an array variable x, we
denote an array selector by the array name followed by the respective component's index i, and
we write xi or x[i].
Because of the first, conventional notation, a component of an array is therefore also
called a subscripted variable.
The common way of operating with arrays, particularly with large arrays, is to selectively update
single components rather than to construct entirely new structured values. This is expressed by
considering an array variable as an array of component variables and by permitting assignments
to selected components, such as for example x[i] := 0.125. Although selective updating causes
only a single component value to change, from a conceptual point of view we must regard the
entire composite value as having changed too.
The fact that array indices, i.e., names of array components, are integers, has a most important
consequence: indices may be computed. A general index expression may be substituted in place
of an index constant; this expression is to be evaluated, and the result identifies the selected
component. This generality not only provides a most significant and powerful programming
facility, but at the same time it also gives rise to one of the most frequently encountered
programming mistakes: the resulting value may be outside the interval specified as the range of
indices of the array. We will assume that decent computing systems provide a warning in the
case of such a mistaken access to a non-existent array component.
The cardinality of a structured type, i. e. the number of values belonging to this type, is the
product of the cardinality of its components. Since all components of an array type T are of the
same base type T0, we obtain
card(T) = card(T0)^n
(Fig. 1.2: x0 = 1.0, x1 = 0.5, x2 = 0.25, x3 = 0.125)
Constituents of array types may themselves be structured. An array variable whose components
are again arrays is called a matrix. For example,
M: ARRAY 10 OF Row
is an array consisting of ten components (rows), each consisting of four components of type
REAL, and is called a 10 x 4 matrix with real components. Selectors may be concatenated
accordingly, such that Mij and M[i][j] denote the jth component of row Mi, which is the ith
component of M. This is usually abbreviated as M[i, j], and in the same spirit the declaration
M: ARRAY 10 OF ARRAY 4 OF REAL
can be written more concisely as M: ARRAY 10, 4 OF REAL.
If a certain operation has to be performed on all components of an array or on adjacent
components of a section of the array, then this fact may conveniently be emphasized by using the
FOR statement, as shown in the following examples for computing the sum and for finding the
maximal element of an array declared as
VAR a: ARRAY N OF INTEGER
sum := 0;
FOR i := 0 TO N-1 DO sum := a[i] + sum END
k := 0; max := a[0];
FOR i := 1 TO N-1 DO
IF max < a[i] THEN k := i; max := a[k] END
END.
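The same two loops can be written in C as a rough sketch; the sample values below are illustrative.

#include <stdio.h>

#define N 8

int main(void) {
    int a[N] = {44, 55, 12, 42, 94, 18, 6, 67};

    /* Sum of all components. */
    int sum = 0;
    for (int i = 0; i < N; i++)
        sum = a[i] + sum;

    /* Index k and value max of the maximal component. */
    int k = 0, max = a[0];
    for (int i = 1; i < N; i++)
        if (max < a[i]) { k = i; max = a[i]; }

    printf("sum = %d, max = a[%d] = %d\n", sum, k, max);
    return 0;
}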
In a further example, assume that a fraction f is represented in its decimal form with k-1 digits,
i.e., by an array d such that
f = Sum (0 ≤ i < k) of d[i] * 10^(-i), i.e.
f = d0 + d1*10^(-1) + d2*10^(-2) + ... + d(k-1)*10^(-(k-1))
Now assume that we wish to divide f by 2. This is done by repeating the familiar division
operation for all k-1 digits d[i], starting with i=1. It consists of dividing each digit by 2 taking
into account a possible carry from the previous position, and of retaining a possible remainder r
for the next position:
r := 10*r + d[i]; d[i] := r DIV 2; r := r MOD 2
This algorithm is used to compute a table of negative powers of 2. The repetition of halving to
compute 2^(-1), 2^(-2), ..., 2^(-N) is again appropriately expressed by a FOR statement, thus
leading to a nesting of two FOR statements.
PROCEDURE Power(VAR W: Texts.Writer; N: INTEGER);
(*compute decimal representation of negative powers of 2*)
VAR i, k, r: INTEGER;
d: ARRAY N OF INTEGER;
BEGIN
FOR k := 0 TO N-1 DO
Texts.Write(W, "."); r := 0;
FOR i := 0 TO k-1 DO
r := 10*r + d[i]; d[i] := r DIV 2; r := r MOD 2;
Texts.Write(W, CHR(d[i] + ORD("0")))
END ;
d[k] := 5; Texts.Write(W, "5"); Texts.WriteLn(W)
END
END Power.
The resulting output text for N = 10 is
.5
.25
.125
.0625
.03125
.015625
.0078125
.00390625
.001953125
.0009765625
A representation of an array structure is a mapping of the (abstract) array with components of
type T onto the store which is an array with components of type BYTE. The array should be
mapped in such a way that the computation of addresses of array components is as simple (and
therefore as efficient) as possible. The address i of the j-th array component is computed by the
linear mapping function i = i0 + j*s
where i0 is the address of the first component, and s is the number of words that a component
occupies.
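This mapping can be checked directly in C, where the component size s corresponds to sizeof; the small sketch below is purely illustrative.

#include <stdio.h>
#include <stdint.h>

int main(void) {
    double a[10];
    int j = 3;
    /* The address of a[j] is the address of a[0] plus j times the component size,
       i.e. the linear mapping i = i0 + j*s described above. */
    uintptr_t i0 = (uintptr_t)&a[0];
    uintptr_t i  = i0 + (uintptr_t)j * sizeof a[0];
    printf("%s\n", ((uintptr_t)&a[j] == i) ? "addresses agree" : "addresses differ");
    return 0;
}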
Assuming that the word is the smallest individually transferable unit of store, it is evidently
highly desirable that s be a whole number, the simplest case being s = 1. If s is not a whole
number (and this is the normal case), then s is usually rounded up to the next larger integer S.
Each array component then occupies S words, whereby S-s words are left unused (see Figs. 1.5
and 1.6). Rounding up of the number of words needed to the next whole number is called
padding. The storage utilization factor u is the quotient of the minimal amount of storage needed
to represent a structure and of the amount actually used:
u = s / (s rounded up to the nearest integer)
WEEK 8:
Linked lists.
• Define a linked list.
• Define a linked list and compare it with a linear list.
• Discuss the advantages and disadvantages of linked lists.
Lists
A linked list is a data structure for storing a list of items. It is made of any number of pieces of
memory (nodes), and each node contains whatever data you are storing along with a pointer (a
link) to another node. By locating the node referenced by that pointer and then doing the same
with the pointer in that new node, and so on, you can traverse the entire list.
Because a linked list stores a list of items, it has some similarities to an array. But the two are
implemented quite differently. An array is a single piece of memory while a linked list contains
as many pieces of memory as there are items in the list. Obviously, if your links get messed up,
you not only lose part of the list, but you will lose any reference to those items no longer included
in the list (unless you store another pointer to those items somewhere).
Some advantages that a linked list has over an array are that you can quickly insert and delete
items in a linked list. Inserting and deleting items in an array requires you to either make room
for new items or fill the "hole" left by deleting an item. With a linked list, you simply rearrange
those pointers that are affected by the change. Linked lists also allow you to have different-sized
nodes in the list. Some disadvantages of linked lists include that they are quite difficult to sort.
Also, you cannot immediately locate, say, the hundredth element in a linked list the way you can
in an array. Instead, you must traverse the list until you've found the hundredth element.
Again, the array implementation of our collection has one serious drawback: you must know the
maximum number of items in your collection when you create it. This presents problems in
programs in which this maximum number cannot be predicted accurately when the program
starts up. Fortunately, we can use a structure called a linked list to overcome this limitation.
Linked lists
The linked list is a very flexible dynamic data structure, that is, a structure which grows or
shrinks as the data it holds changes: items may be added to it or deleted from it at will. (Lists,
stacks and trees are all dynamic structures.) A programmer need not worry about how many
items a program will have to accommodate: this allows us to write robust programs which
require much less maintenance. A very common source of problems in program maintenance is
the need to increase the capacity of a program to handle larger collections: even the most
generous allowance for growth tends to prove inadequate over time!
In a linked list, each item is allocated space as it is added to the list. A link is kept with each item
to the next item in the list.
Each node of the list has two elements
1. the item being stored in the list and
2. a pointer to the next item in the list
The last node in the list contains a NULL
pointer to indicate that it is the end or tail of
the list.
As items are added to a list, memory for a node is dynamically allocated. Thus the number of
items that may be added to a list is limited only by the amount of memory available.
Handle for the list
The variable (or handle) which represents the list is simply a pointer to the node at the head of the list.
Adding to a list
The simplest strategy for adding an item to a list is to:
a.
b.
c.
d.
allocate space for a new node,
copy the item into it,
make the new node's next pointer point to the current head of the list and
make the head of the list point to the newly allocated node.
This strategy is fast and efficient, but each item is added to the head of the list.
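A minimal C sketch of this add-to-head strategy is shown below. The structure and function names are illustrative; this is not the AddToCollection routine referred to later.

#include <stdlib.h>

/* One node of the list: the stored item and a link to the next node. */
struct node {
    void *item;
    struct node *next;
};

/* Add an item at the head of the list and return the new head.
   (The head pointer itself is the handle for the list.) */
struct node *add_to_head(struct node *head, void *item) {
    struct node *n = malloc(sizeof *n);   /* a. allocate space for a new node     */
    n->item = item;                       /* b. copy the item (a pointer) into it */
    n->next = head;                       /* c. link it to the current head       */
    return n;                             /* d. the new node becomes the head     */
}

Repeated calls of the form head = add_to_head(head, item) build the list with the most recently added item at its head.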
An alternative is to create a structure for the list which contains both head and tail pointers:
struct fifo_list {
struct node *head;
struct node *tail;
};
The code for AddToCollection is now trivially modified to make a list in which the item most
recently added to the list is the list's tail.
The specification remains identical to that used for the array implementation: the max_item
parameter to ConsCollection is simply ignored .
Thus we only need to change the implementation. As a consequence, applications which use this
object will need no changes. The ramifications for the cost of software maintenance are
significant.
The data structure is changed, but since the details (the attributes of the object or the elements of
the structure) are hidden from the user, there is no impact on the user's program.
Points to note:
a. This implementation of our collection can be substituted for the first one with no changes to a
client's program. With the exception of the added flexibility that any number of items may be
added to our collection, this implementation provides exactly the same high level behaviour as
the previous one.
b. The linked list implementation has exchanged flexibility for efficiency - on most systems, the
system call to allocate memory is relatively expensive. Pre-allocation in the array-based
implementation is generally more efficient. More examples of such trade-offs will be found
later.
The study of data structures and algorithms will enable you to make the implementation decision
which most closely matches your users' specifications.
WEEK 9:
Properties of linked lists.
This week's learning outcomes:
• Explain the types of linked lists.
• Applications of linked lists.
• Implementation of different operations on linked lists.
Types of linked lists
Linearly linked list
Singly-linked list
The simplest kind of linked list is a singly-linked list (or slist for short), which has one link per
node. This link points to the next node in the list, or to a null value or empty list if it is the final
node.
A singly-linked list containing two values: the value of the current node and a link to the next node
A singly linked list's node is divided into two parts. The first part holds or points to information
about the node, and the second part holds the address of the next node. A singly linked list can
be traversed in only one direction.
Doubly-linked list
A more sophisticated kind of linked list is a doubly-linked list or two-way linked list. Each
node has two links: one points to the previous node, or points to a null value or empty list if it is
the first node; and one points to the next, or points to a null value or empty list if it is the final
node.
(Figure: a doubly-linked list containing three integer values: the value, the link forward to the
next node, and the link backward to the previous node)
In some very low level languages, XOR-linking offers a way to implement doubly-linked lists
using a single word for both links, although the use of this technique is usually discouraged.
Circularly-linked list
In a circularly-linked list, the first and final nodes are linked together. This can be done for both
singly and doubly linked lists. To traverse a circular linked list, you begin at any node and follow
the list in either direction until you return to the original node. Viewed another way, circularly-
linked lists can be seen as having no beginning or end. This type of list is most useful for
managing buffers for data ingest, and in cases where you have one object in a list and wish to
iterate through all other objects in the list in no particular order.
The pointer pointing to the whole list may be called the access pointer.
(Figure: a circularly-linked list containing three integer values)
Sentinel nodes
Linked lists sometimes have a special dummy or sentinel node at the beginning and/or at the end
of the list, which is not used to store data. Its purpose is to simplify or speed up some operations,
by ensuring that every data node always has a previous and/or next node, and that every list
(even one that contains no data elements) always has a "first" and "last" node. Lisp has such a
design - the special value nil is used to mark the end of a 'proper' singly-linked list, or chain of
cons cells as they are called. A list does not have to end in nil, but a list that did not would be
termed 'improper'.
Applications of linked lists
Linked lists are used as a building block for many other data structures, such as stacks, queues
and their variations.
The "data" field of a node can be another linked list. By this device, one can construct many
linked data structures with lists; this practice originated in the Lisp programming language,
where linked lists are a primary (though by no means the only) data structure, and is now a
common feature of the functional programming style.
Sometimes, linked lists are used to implement associative arrays, and are in this context called
association lists. There is very little good to be said about this use of linked lists; they are easily
outperformed by other data structures such as self-balancing binary search trees even on small
data sets (see the discussion in associative array). However, sometimes a linked list is
dynamically created out of a subset of nodes in such a tree, and used to more efficiently traverse
that set.
Linked lists vs. arrays
                                                  Array    Linked list
Indexing                                          O(1)     O(n)
Inserting / deleting at end                       O(1)     O(1) or O(n)[2]
Inserting / deleting in middle (with iterator)    O(n)     O(1)
Persistent                                        No       Singly yes
Locality                                          Great    Bad
Linked lists have several advantages over arrays. Elements can be inserted into linked lists
indefinitely, while an array will eventually either fill up or need to be resized, an expensive
operation that may not even be possible if memory is fragmented. Similarly, an array from which
many elements are removed may become wastefully empty or need to be made smaller.
Further memory savings can be achieved, in certain cases, by sharing the same "tail" of elements
among two or more lists — that is, the lists end in the same sequence of elements. In this way,
one can add new elements to the front of the list while keeping a reference to both the new and
the old versions — a simple example of a persistent data structure.
On the other hand, arrays allow random access, while linked lists allow only sequential access to
elements. Singly-linked lists, in fact, can only be traversed in one direction. This makes linked
lists unsuitable for applications where it's useful to look up an element by its index quickly, such
as heapsort. Sequential access on arrays is also faster than on linked lists on many machines due
to locality of reference and data caches. Linked lists receive almost no benefit from the cache.
Another disadvantage of linked lists is the extra storage needed for references, which often
makes them impractical for lists of small data items such as characters or boolean values. It can
also be slow, and with a naïve allocator, wasteful, to allocate memory separately for each new
element, a problem generally solved using memory pools.
A number of linked list variants exist that aim to ameliorate some of the above problems.
Unrolled linked lists store several elements in each list node, increasing cache performance while
decreasing memory overhead for references. CDR coding does both these as well, by replacing
references with the actual data referenced, which extends off the end of the referencing record.
A good example that highlights the pros and cons of using arrays vs. linked lists is by
implementing a program that resolves the Josephus problem. The Josephus problem is an
election method that works by having a group of people stand in a circle. Starting at a
predetermined person, you count around the circle n times. Once you reach the nth person, take
them out of the circle and have the members close the circle. Then count around the circle the
same n times and repeat the process, until only one person is left. That person wins the election.
This shows the strengths and weaknesses of a linked list vs. an array, because if you view the
people as connected nodes in a circular linked list then it shows how easily the linked list is able
to delete nodes (as it only has to rearrange the links to the different nodes). However, the linked
list will be poor at finding the next person to remove and will need to recurse through the list
until it finds that person. An array, on the other hand, will be poor at deleting nodes (or elements)
as it cannot remove one node without individually shifting all the elements up the list by one.
However, it is exceptionally easy to find the nth person in the circle by directly referencing them
by their position in the array.
The list ranking problem concerns the efficient conversion of a linked list representation into an
array. Although trivial for a conventional computer, solving this problem by a parallel algorithm
is complicated and has been the subject of much research.
Doubly-linked vs. singly-linked
Double-linked lists require more space per node (unless one uses xor-linking), and their
elementary operations are more expensive; but they are often easier to manipulate because they
allow sequential access to the list in both directions. In particular, one can insert or delete a node
in a constant number of operations given only that node's address. In a singly-linked list, by
contrast, the previous node's address is required in order to correctly insert or delete. Some
algorithms require access in both directions. On the other hand, they do not allow tail-sharing,
and cannot be used as persistent data structures.
Circularly-linked vs. linearly-linked
Circular linked lists are most useful for describing naturally circular structures, and have the
advantage of regular structure and being able to traverse the list starting at any point. They also
allow quick access to the first and last records through a single pointer (the address of the last
element). Their main disadvantage is the complexity of iteration, which has subtle special cases.
Sentinel nodes (header nodes)
Doubly linked lists can be structured without using a front and NULL pointer to the ends of the
list. Instead, a node of object type T set with specified default values is used to indicate the
"beginning" of the list. This node is known as a Sentinel node and is commonly referred to as a
"header" node. Common searching and sorting algorithms are made less complicated through the
use of a header node, as every element now points to another element, and never to NULL. The
header node, like any other, contains a "next" pointer that points to what is considered by the
linked list to be the first element. It also contains a "previous" pointer which points to the last
element in the linked list. In this way, a doubly linked list structured around a Sentinel Node is
circular.
The Sentinel node is defined as another node in a doubly linked list would be, but the allocation
of a front pointer is unnecessary as the next and previous pointers of the Sentinel node will point
to itself. This is defined in the default constructor of the list.
next = this; prev = this;
If the previous and next pointers point to the Sentinel node, the list is considered empty.
Otherwise, if one or more elements is added, both pointers will point to another node, and the list
will contain those elements. [3]
Sentinel node may simplify certain list operations, by ensuring that the next and/or previous
nodes exist for every element. However sentinel nodes use up extra space (especially in
applications that use many short lists), and they may complicate other operations. To avoid the
extra space requirement the sentinel nodes can often be reused as references to the first and/or
last node of the list.
The Sentinel node eliminates the need to keep track of a pointer to the beginning of the list, and
also eliminates any errors that could result in the deletion of the first pointer, or any accidental
relocation.
Linked list operations
When manipulating linked lists in-place, care must be taken to not use values that you have
invalidated in previous assignments. This makes algorithms for inserting or deleting linked list
nodes somewhat subtle. This section gives pseudocode for adding or removing nodes from
singly, doubly, and circularly linked lists in-place. Throughout we will use null to refer to an
end-of-list marker or sentinel, which may be implemented in a number of ways.
Linearly-linked lists
Singly-linked lists
Our node data structure will have two fields. We also keep a variable firstNode which always
points to the first node in the list, or is null for an empty list.
record Node {
data // The data being stored in the node
next // A reference to the next node, null for last node
}
record List {
Node firstNode
// points to first node of list; null for empty list
}
Traversal of a singly-linked list is simple, beginning at the first node and following each next
link until we come to the end:
node := list.firstNode
while node not null {
(do something with node.data)
node := node.next
}
The following code inserts a node after an existing node in a singly linked list. The diagram
shows how it works. Inserting a node before an existing one cannot be done; instead, you have to
locate it while keeping track of the previous node.
function insertAfter(Node node, Node newNode) { // insert newNode after node
    newNode.next := node.next
    node.next := newNode
}
Inserting at the beginning of the list requires a separate function. This requires updating
firstNode.
function insertBeginning(List list, Node newNode) { // insert node before current first node
    newNode.next := list.firstNode
    list.firstNode := newNode
}
Similarly, we have functions for removing the node after a given node, and for removing a node
from the beginning of the list. The diagram demonstrates the former. To find and remove a
particular node, one must again keep track of the previous element.
function removeAfter(node node) { // remove node past this one
obsoleteNode := node.next
node.next := node.next.next
destroy obsoleteNode
}
function removeBeginning(List list) { // remove first node
    obsoleteNode := list.firstNode
    list.firstNode := list.firstNode.next   // point past deleted node
    destroy obsoleteNode
}
Notice that removeBeginning() sets list.firstNode to null when removing the last node in the list.
Since we can't iterate backwards, efficient "insertBefore" or "removeBefore" operations are not
possible.
Appending one linked list to another can be inefficient unless a reference to the tail is kept as
part of the List structure, because we must traverse the entire first list in order to find the tail, and
then append the second list to this. Thus, if two linearly-linked lists are each of length n, list
appending has asymptotic time complexity of O(n). In the Lisp family of languages, list
appending is provided by the append procedure.
Many of the special cases of linked list operations can be eliminated by including a dummy
element at the front of the list. This ensures that there are no special cases for the beginning of
the list and renders both insertBeginning() and removeBeginning() unnecessary. In this case, the
first useful data in the list will be found at list.firstNode.next.
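For concreteness, the singly-linked insertion and removal operations above might be written in C as follows; the int payload and the function names are assumptions made for this sketch.

#include <stdlib.h>

struct Node {
    int data;
    struct Node *next;
};

/* Insert newNode immediately after node. */
void insert_after(struct Node *node, struct Node *newNode) {
    newNode->next = node->next;
    node->next = newNode;
}

/* Remove and free the node immediately after node (which must exist). */
void remove_after(struct Node *node) {
    struct Node *obsolete = node->next;
    node->next = obsolete->next;
    free(obsolete);
}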
Doubly-linked lists
With doubly-linked lists there are even more pointers to update, but also less information is
needed, since we can use backwards pointers to observe preceding elements in the list. This
enables new operations, and eliminates special-case functions. We will add a prev field to our
nodes, pointing to the previous element, and a lastNode field to our list structure which always
points to the last node in the list. Both list.firstNode and list.lastNode are null for an empty list.
record Node {
data // The data being stored in the node
next // A reference to the next node; null for last node
prev // A reference to the previous node; null for first node
}
record List {
Node firstNode
// points to first node of list; null for empty list
Node lastNode
// points to last node of list; null for empty list
}
Iterating through a doubly linked list can be done in either direction. In fact, direction can change
many times, if desired.
Forwards
node := list.firstNode
while node ≠ null
    <do something with node.data>
    node := node.next
Backwards
node := list.lastNode
while node ≠ null
    <do something with node.data>
    node := node.prev
These symmetric functions add a node either after or before a given node, with the diagram
demonstrating after:
function insertAfter(List list, Node node, Node newNode)
newNode.prev := node
newNode.next := node.next
if node.next = null
list.lastNode := newNode
else
node.next.prev := newNode
node.next := newNode
function insertBefore(List list, Node node, Node newNode)
newNode.prev := node.prev
newNode.next := node
if node.prev is null
list.firstNode := newNode
else
node.prev.next := newNode
node.prev := newNode
We also need a function to insert a node at the beginning of a possibly-empty list:
function insertBeginning(List list, Node newNode)
if list.firstNode = null
list.firstNode := newNode
list.lastNode := newNode
newNode.prev := null
newNode.next := null
else
insertBefore(list, list.firstNode, newNode)
A symmetric function, insertEnd, inserts at the end of the list by updating list.lastNode in the
same way.
Circularly-linked lists
In a circularly-linked list the last node links back to the first. This representation significantly
simplifies adding and removing nodes within a non-empty list, but empty lists are then a special
case.
Doubly-circularly-linked lists
Assuming that someNode is some node in a non-empty list, this code iterates through that list
starting with someNode (any node will do):
Forwards
node := someNode
do
    do something with node.value
    node := node.next
while node ≠ someNode
Backwards
node := someNode
do
    do something with node.value
    node := node.prev
while node ≠ someNode
Notice the postponing of the test to the end of the loop. This is important for the case where the
list contains only the single node someNode.
This simple function inserts a node into a doubly-linked circularly-linked list after a given
element:
function insertAfter(Node node, Node newNode)
newNode.next := node.next
newNode.prev := node
node.next.prev := newNode
node.next := newNode
To do an "insertBefore", we can simply "insertAfter(node.prev, newNode)". Inserting an element
in a possibly empty list requires a special function:
function insertEnd(List list, Node node)
if list.lastNode = null
node.prev := node
node.next := node
else
insertAfter(list.lastNode, node)
list.lastNode := node
To insert at the beginning we simply "insertAfter(list.lastNode, node)". Finally, removing a node
must deal with the case where the list empties:
function remove(List list, Node node)
if node.next = node
list.lastNode := null
else
node.next.prev := node.prev
node.prev.next := node.next
if node = list.lastNode
list.lastNode := node.prev;
destroy node
As in doubly-linked lists, "removeAfter" and "removeBefore" can be implemented with
"remove(list, node.next)" and "remove(list, node.prev)" respectively.
Doubly Linked Lists
Doubly linked lists have a pointer to the preceding item as well as one to the next.
They permit scanning or searching of the list in both directions. (To go backwards in a simple list, it is
necessary to go back to the start and scan forwards.) Many applications require searching backwards
and forwards through sections of a list: for example, searching for a common name like "Kim" in a
Korean telephone directory would probably need much scanning backwards and forwards through a
small region of the whole list, so the backward links become very useful. In this case, the node structure
is altered to have two links:
struct t_node {
void *item;
struct t_node *previous;
struct t_node *next;
} node;
WEEK 10:
Non-linear structures.
This week's learning outcomes:
• Define a tree.
• State the properties of a tree.
• Describe different types of tree (general tree, binary tree).
• Explain binary tree representation.
Tree Structures
Basic Concepts and Definitions
Strings, arrays, lists and queues are linear types of data structure. A tree, however, is a nonlinear
data structure mainly used to represent data containing a hierarchical relationship between
elements. Examples of this feature include records, family trees and tables of contents.
Trees
void BST::clearhelp (NodePtr P)
{
    if (P == NULL) return;
    clearhelp (P->Left);
    clearhelp (P->Right);
    delete P;
}
(Figure: a binary tree with nodes A to I)
balance = (height of left subtree) - (height of right subtree)
X = Node [Pivot].Left;
Y = Node [X].Right;
Node [Pivot].Left = Node [Y].Right;
Node [X].Right = Node [Y].Left;
Node [Y].Left = X;
Node [Y].Right = Pivot;
Pivot = Y;
Subcase # 1
if Node [Pivot].BF = 0 then
{
Node [Node [Pivot].Left].BF = 0
Node [Node [Pivot].Right].BF = 0
}
Subcase # 2
else if Node [Pivot].BF = 1 then
{
Node [Pivot].BF = 0
Node [Node [Pivot].Left].BF = 1
Node [Node [Pivot].Right].BF = -1
}
Subcase # 3
else
{
    Node [Pivot].BF = 0
    Node [Node [Pivot].Left].BF = +1
    Node [Node [Pivot].Right].BF = 0
}
Priority Queues:
1. Linked list - insert in priority order.
2. Array - sort it and rearrange the elements.
3. Queue - search.
4. Heap sort - partially ordered tree.
5. Array of queues - small set of priorities.
The priority of a node V is not greater than that of its children.
Trees that have lots of dependents (general trees)
(Figure: an example general tree with node values 3, 5, 6, 9, 10, 18)
Divide & conquer: O(log n)
The left pointer points to a list of children; the right pointer points to a list of siblings.
Another way to implement this is with an array of pointers.
In-Order Traversal:
• Traverse the forest of the first tree in inorder.
• Visit the root of the first tree.
• Visit the forest of the remaining trees in order.
Algorithm:
void intra (Ptr R)
{
    if (R == NULL)
        return;
    else
    {
        intra (R->child);
        cout << R->Info;
        intra (R->sib);
    }
}
WEEK 11:
Non-linear structures.
This week's learning outcomes:
• Describe a binary tree.
• Analyse a complete tree.
Binary Trees
The simplest form of tree is a binary tree. A binary tree consists of
a. a node (called the root node) and
b. left and right sub-trees.
Both the sub-trees are themselves binary trees.
You now have a recursively defined data structure. (It is also possible to define a list recursively:
can you see how?)
A binary tree
The nodes at the lowest levels of the tree (the ones with no sub-trees) are called leaves.
For a complete tree with n nodes, the height is h = floor( log2 n ).
Examination of the Find method shows that in the worst case, h+1 or ceiling( log2 n )
comparisons are needed to find an item. This is the same as for binary search.
However, Add also requires ceiling( log2 n ) comparisons to determine where to add an item.
Actually adding the item takes a constant number of operations, so we say that a binary tree
requires O(log n) operations for both adding and finding an item - a considerable improvement
over binary search for a dynamic structure which often requires addition of new items.
Deletion is also an O(log n) operation.
General binary trees
However, in general addition of items to an ordered tree will not produce a complete tree. The worst
case occurs if we add an ordered list of items to a tree.
What will happen? Think about it before reading on.
This problem is readily overcome: we use a structure known as a heap. However, before looking
at heaps, we should formalise our ideas about the complexity of algorithms by defining carefully
what O(f(n)) means.
Key terms
Root Node
Node at the "top" of a tree - the one from which all operations on the tree commence. The root
node may not exist (a NULL tree with no nodes in it) or have 0, 1 or 2 children in a binary tree.
Leaf Node
Node at the "bottom" of a tree - farthest from the root. Leaf nodes have no children.
Complete Tree
Tree in which each leaf is at the same distance from the root. A more precise and formal
definition of a complete tree is set out later.
Height
Number of nodes which must be traversed from the root to reach a leaf of a tree.
WEEK 12:
Non-linear structures.
This week's learning outcomes:
• Binary tree structure revisited.
• State the properties of a tree.
• Basic operations on binary trees.
Binary Tree
A binary tree T is defined as a finite set of elements called nodes, such that
a.) T is empty (called the null tree or empty tree), or
b.) T contains a distinguished node R, called the root of T, and the remaining nodes
of T form an ordered pair of disjoint binary trees T1 and T2.
If T contains a root R, then the two trees T1 and T2 are called, respectively, the left and
right sub-trees of R. If T1 is nonempty, then its root is called the left successor of R;
similarly, if T2 is nonempty, then its root is called the right successor of R.
(Figure: a binary tree with root A and nodes B, C, D, E, F, G, H, J, K and L)
A left-downward slanted line from a node N indicates a left successor of N, and a
right-downward slanted line from N indicates a right successor of N . Observe that :
i.) B is a left successor and C is a right successor of the node A.
ii.) The left sub-tree of the root A consists of the nodes B, D, E and F.
iii.) The right sub-tree of A consists of the nodes C, G, H, J, K and L.
Any node N in a binary tree T has either 0,1, or 2 successors. The nodes A,B,C and H
have 2 successors, the nodes E and J have only one successor, and the nodes D,F,G,L,
and K have no successors. The nodes with zero successors are called terminal nodes.
Terminology
Suppose N is a node in T with left successor S1 and right successor S2 then N is
called the parent (or father) of S1 and S2 . Analogously, S1 is called the left child
( or son ) of N and S2 is called the right child (or son). Furthermore, S1 and S2 are said to
be siblings (or brothers). Every node N in a binary tree T, except the root has a unique
parent, called the predecessor of N.
The terms, descendant and ancestor have their usual meaning. That is , a node L is
called a descendant of node N (and N is called an ancestor of L) if there is a succession
of children from N to L. In particular, L is called a left or right descendant of N
according to whether L belongs to the left or right sub-tree of N . The line drawn
from a node N of T to a successor is called an edge, and a sequence of consecutive
edges is called a path. A terminal node is called a leaf, and path ending in a leaf is
called a branch. Each node in a binary tree T is assigned a level number as follows:
The root R of the tree T is assigned the level number 0 , and every other node is assigned
a level number which is 1 more than the level number of its parent . Those nodes
with the same level are said to belong to the same generation .
The depth (or height) of a tree T is the maximum number of nodes in a branch of T. This
turns out to be 1 more than the largest level number of T. Binary trees T and T1 are said
to be similar if they have the same structure or in other words, if they have the same
shape . The trees are said to be copies if they are similar and if they have the same
contents at corresponding nodes.
TRAVERSING BINARY TREES
Basic Operations on Binary Trees
There are many tasks that may have to be performed on a tree structure; a common one is that of
executing a given operation P on each element of the tree. P is then understood to be a parameter
of the more general task of visiting all nodes or, as it is usually called, of tree traversal. If we
consider the task as a single sequential process, then the individual nodes are visited in some
specific order and may be considered as being laid out in a linear arrangement. In fact, the
description of many algorithms is considerably facilitated if we can talk about processing the
next element in the tree based on an underlying order. There are three principal orderings that
emerge naturally from the structure of trees. Like the tree structure itself, they are conveniently
expressed in recursive terms. Referring to a binary tree, let R denote the root and let A and B
denote the left and right subtrees; the three orderings are
1. Preorder: R, A, B (visit root before the subtrees)
2. Inorder: A, R, B
3. Postorder: A, B, R (visit root after the subtrees)
In other words, there are three (3) standard ways of traversing a binary tree T with root
R. These three algorithms are called preorder, inorder, and postorder.
Preorder:
a.) Process the root R.
b.) Traverse the left sub-tree of R in preorder.
c.) Traverse the right sub-tree of R in Preorder.
Inorder:
a.) Traverse the left sub-tree of R inorder.
b.) Process the root R .
c.) Traverse the right sub-tree of R inorder .
Postorder:
a.) Traverse the left sub-tree of R in postorder.
b.) Traverse the right sub-tree of R in postorder.
c.) Process the root R.
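These three orderings translate directly into short recursive C routines. The node structure below is an assumed, minimal one.

#include <stdio.h>

struct node {
    int info;
    struct node *left, *right;
};

/* Preorder: process the root, then the left sub-tree, then the right sub-tree. */
void preorder(const struct node *r) {
    if (r == NULL) return;
    printf("%d ", r->info);
    preorder(r->left);
    preorder(r->right);
}

/* Inorder: left sub-tree, root, right sub-tree. */
void inorder(const struct node *r) {
    if (r == NULL) return;
    inorder(r->left);
    printf("%d ", r->info);
    inorder(r->right);
}

/* Postorder: left sub-tree, right sub-tree, root. */
void postorder(const struct node *r) {
    if (r == NULL) return;
    postorder(r->left);
    postorder(r->right);
    printf("%d ", r->info);
}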
Traverse the following binary tree using:
i.) Preorder algorithm
ii.) Inorder algorithm
iii.)Postorder algorithm.
(Figure: a binary tree with nodes numbered 1 to 10)
Solution:
i.) Preorder algorithm: 1 2 4 8 9 5 10 3 6 7
ii.) Inorder algorithm:
?
iii.)Postorder algorithm: ?
WEEK 13:
Sorting
This week's learning outcomes:
• Define sorting.
• Explain various categories of sorting.
Sorting
Sorting is one of the most important operations performed by computers. In the days of magnetic
tape storage before modern data-bases, it was almost certainly the most common operation
performed by computers as most "database" updating was done by sorting transactions and
merging them with a master file. It's still important for presentation of data extracted from
databases: most people prefer to get reports sorted into some relevant order before wading
through pages of data!
Sorting is generally understood to be the process of rearranging a given set of objects in a
specific order. The purpose of sorting is to facilitate the later search for members of the sorted
set. As such it is an almost universally performed, fundamental activity. Objects are sorted in
telephone books, in income tax files, in tables of contents, in libraries, in dictionaries, in
warehouses, and almost everywhere that stored objects have to be searched and retrieved. Even
small children are taught to put their things "in order", and they are confronted with some sort of
sorting long before they learn anything about arithmetic.
Hence, sorting is a relevant and essential activity, particularly in data processing. What else
would be easier to sort than data! Nevertheless, our primary interest in sorting is devoted to the
even more fundamental techniques used in the construction of algorithms. There are not many
techniques that do not occur somewhere in connection with sorting algorithms. In particular,
sorting is an ideal subject to demonstrate a great diversity of algorithms, all having the same
purpose, many of them being optimal in some sense, and most of them having advantages over
others. It is therefore an ideal subject to demonstrate the necessity of
performance analysis of algorithms. The example of sorting is moreover well suited for showing
how a very significant gain in performance may be obtained by the development of sophisticated
algorithms when obvious methods are readily available.
Different categories of sorting
The dependence of the choice of an algorithm on the structure of the data to be processed -- a
ubiquitous phenomenon -- is so profound in the case of sorting that sorting methods are generally
classified into two categories, namely:
i.) sorting of arrays, and
ii.) sorting of (sequential) files.
The two classes are often called internal and external sorting because arrays are
stored in the fast, high-speed, random-access "internal" store of computers and files
are appropriate on the slower, but more spacious "external" stores based on
mechanically moving devices (disks and tapes).
WEEK 14:
Different types of sorting techniques.
This week's learning outcomes:
• Explain sorting by insertion.
• Explain sorting by selection.
• Explain sorting by exchange.
Different types of sorting techniques.
Sorting methods that sort items in situ can be classified into three principal categories according
to their underlying method:
Sorting by insertion
Sorting by selection
Sorting by exchange
These three principles will now be examined and compared. The procedures operate on a global
variable a whose components are to be sorted in situ, i.e. without requiring additional, temporary
storage. The components are the keys themselves. We discard other data represented by the
record type Item, thereby simplifying matters. In all algorithms to be developed in this chapter,
we will assume the presence of an array a and a constant n, the number of elements of a:
TYPE Item = INTEGER;
VAR a: ARRAY n OF Item
Sorting by Straight Insertion
This method is widely used by card players. The items (cards) are conceptually divided into a
destination sequence a1 ... ai-1 and a source sequence ai ... an. In each step, starting with i = 2
and incrementing i by unity, the i th element of the source sequence is picked and transferred into
the destination sequence by inserting it at the appropriate place.
Initial Keys: 44 55 12 42 94 18 06 67
i=1 44 55 12 42 94 18 06 67
i=2 12 44 55 42 94 18 06 67
i=3 12 42 44 55 94 18 06 67
i=4 12 42 44 55 94 18 06 67
i=5 12 18 42 44 55 94 06 67
i=6 06 12 18 42 44 55 94 67
i=7 06 12 18 42 44 55 67 94
A Sample Process of Straight Insertion Sorting.
The process of sorting by insertion is shown in an example of eight numbers chosen at random .
The algorithm of straight insertion is as follows:
FOR i := 1 TO n-1 DO
    x := a[i]; j := i;
    WHILE (j > 0) & (x < a[j-1]) DO
        a[j] := a[j-1]; j := j-1
    END ;
    a[j] := x
END
Sorting by Straight Selection
This method is based on the following principle:
1. Select the item with the least key.
2. Exchange it with the first item a0.
3. Then repeat these operations with the remaining n-1 items, then with n-2 items, until only one
item -- the largest -- is left.
This method is shown on the same eight keys as given above.
Initial keys 44 55 12 42 94 18 06 67
06 55 12 42 94 18 44 67
06 12 55 42 94 18 44 67
06 12 18 42 94 55 44 67
06 12 18 42 94 55 44 67
06 12 18 42 44 55 94 67
06 12 18 42 44 55 94 67
06 12 18 42 44 55 67 94
A Sample Process of Straight Selection Sorting.
The algorithm is formulated as follows:
FOR i := 0 TO n-1 DO
assign the index of the least item of ai ... an-1 to k;
exchange ai with ak
END
This method, called straight selection, is in some sense the opposite of straight insertion: straight
insertion considers in each step only the one next item of the source sequence and all items of the
destination array to find the insertion point; straight selection considers all items of the source
array to find the one with the least key, to be deposited as the one next item of the destination
sequence.
PROCEDURE StraightSelection;
VAR i, j, k: INTEGER; x: Item;
BEGIN
FOR i := 0 TO n-2 DO
k := i; x := a[i];
FOR j := i+1 TO n-1 DO
IF a[j] < x THEN k := j; x := a[k] END
END ;
a[k] := a[i]; a[i] := x
END
END StraightSelection
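A rough C equivalent of straight selection, for comparison with the C insertion and bubble sorts given in a later week, is shown below; the function name is illustrative.

/* Straight selection sort: on each pass, find the smallest remaining item
   and exchange it with the first item of the unsorted section. */
void selection(int a[], int n) {
    for (int i = 0; i < n - 1; i++) {
        int k = i;                             /* index of the least item so far */
        for (int j = i + 1; j < n; j++)
            if (a[j] < a[k])
                k = j;
        int x = a[k]; a[k] = a[i]; a[i] = x;   /* exchange a[i] and a[k] */
    }
}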
WEEK 15:
Different sorting and searching techniques.
This week's learning outcomes:
• Explain various sorting techniques.
• Explain the linear and binary search algorithms.
Different types of sorting revisited:
Bubble, Selection, Insertion Sorts
There are a large number of variations of one basic strategy for sorting. It's the same strategy that
you use for sorting your bridge hand. You pick up a card, start at the beginning of your hand and
find the place to insert the new card, insert it and move all the others up one place.
/* Insertion sort for integers */
void insertion( int a[], int n ) {
/* Pre-condition: a contains n items to be sorted */
int i, j, v;
/* Initially, the first item is considered 'sorted' */
/* i divides a into a sorted region, x<i, and an
unsorted one, x >= i */
for(i=1;i<n;i++) {
/* Select the item at the beginning of the
as yet unsorted section */
v = a[i];
/* Work backwards through the array, finding where v
should go */
j = i;
/* If this element is greater than v,
move it up one */
while ( a[j-1] > v ) {
a[j] = a[j-1]; j = j-1;
if ( j <= 0 ) break;
}
/* Stopped when a[j-1] <= v, so put v at position j */
a[j] = v;
}
}
Bubble Sort
Another variant of this procedure, called bubble sort, is commonly taught:
/* Bubble sort for integers */
#define SWAP(a,b)
{ int t; t=a; a=b; b=t; }
void bubble( int a[], int n )
/* Pre-condition: a contains n items to be sorted */
{
int i, j;
/* Make n passes through the array */
for(i=0;i<n;i++)
{
/* From the first element to the end
of the unsorted section */
for(j=1;j<(n-i);j++)
{
/* If adjacent items are out of order, swap them */
if( a[j-1]>a[j] ) SWAP(a[j-1],a[j]);
}
}
}
Analysis
Each of these algorithms requires n-1 passes: each pass places one item in its correct place. (The
nth item is then in the correct place also.) The ith pass makes either i or n-i comparisons and
moves. So the total number of comparisons is 1 + 2 + ... + (n-1) = n(n-1)/2,
or O(n^2) - but we already know we can use heaps to get an O(n log n) algorithm. Thus these
algorithms are only suitable for small problems where their simple code makes them faster than
the more complex code of the O(n log n) algorithm. As a rule of thumb, expect to find an
O(n log n) algorithm faster for n > 10 - but the exact value depends very much on individual
machines! They can be used to squeeze a little bit more performance out of fast sort algorithms.
Quick Sort
Quicksort is a very efficient sorting algorithm invented by C.A.R. Hoare. It has two phases:
• the partition phase and
• the sort phase.
As we will see, most of the work is done in the partition phase - it works out where to divide the
work. The sort phase simply sorts the two smaller problems that are generated in the partition
phase.
This makes Quicksort a good example of the divide and conquer strategy for solving problems.
(You've already seen an example of this approach in the binary search procedure.) In quicksort,
we divide the array of items to be sorted into two partitions and then call the quicksort procedure
recursively to sort the two partitions, ie we divide the problem into two smaller ones and conquer
by solving the smaller ones. Thus the conquer part of the quicksort routine looks like this:
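A minimal C sketch of this conquer step is given below. It assumes a partition function that places the pivot and returns its final index; the Lomuto scheme shown is one possible choice, not necessarily the one intended by the original listing.

/* One possible partition scheme (Lomuto): items <= pivot end up to the left
   of the returned index, larger items to the right. */
int partition(int a[], int low, int high) {
    int pivot = a[high], i = low;
    for (int j = low; j < high; j++) {
        if (a[j] <= pivot) {
            int t = a[i]; a[i] = a[j]; a[j] = t;
            i++;
        }
    }
    int t = a[i]; a[i] = a[high]; a[high] = t;
    return i;
}

/* The conquer part: sort the two partitions recursively. */
void quicksort(int a[], int low, int high) {
    if (low < high) {
        int p = partition(a, low, high);   /* divide: place the pivot     */
        quicksort(a, low, p - 1);          /* conquer the left partition  */
        quicksort(a, p + 1, high);         /* conquer the right partition */
    }
}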
The pseudocode for a procedure to search the table for a given course code is as follows:
Procedure SEARCH_Table
Begin
    subscript = 0
    code_found = false
    Repeat
        subscript = subscript + 1
        If courses[subscript].course_code = item_sought
            Then code_found = true
        Endif
    Until code_found = true or subscript = no_of_elements
          or courses[subscript].course_code > item_sought
End procedure
The procedure sets a boolean variable code_found to true if the course code is found.
Binary Search
However, if we place our items in an array and sort them in either ascending or descending order on the
key first, then we can obtain much better performance with an algorithm called binary search.
Binary search is a technique for searching an ordered list in which we first check the middle item and -
based on that comparison - "discard" half the data. The same procedure is then applied to the remaining
half until a match is found or there are no more items left.
That is, in binary search, we first compare the key with the item in the middle position of the
array. If there's a match, we can return immediately. If the key is less than the middle key, then
the item sought must lie in the lower half of the array; if it's greater then the item sought must lie
in the upper half of the array. So we repeat the procedure on the lower (or upper) half of the
array.
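A minimal C sketch of this procedure is shown below (declared static, in line with the remark that follows); the signature is an assumption.

/* Binary search in an array sorted in ascending order.
   Returns the index of key, or -1 if it is not present. */
static int binary_search(const int a[], int n, int key) {
    int low = 0, high = n - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;   /* the middle position         */
        if (a[mid] == key) return mid;      /* match: return immediately   */
        if (key < a[mid]) high = mid - 1;   /* item must lie in lower half */
        else              low  = mid + 1;   /* item must lie in upper half */
    }
    return -1;                              /* no more items left          */
}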
static reduces the visibility of a function and should be used wherever possible to control
access to functions!
Analysis
Each step of the algorithm divides the block of items being searched in half. We can divide a set
of n items in half at most log2 n times. Thus the running time of a binary search is proportional
to log n and we say this is an O(log n) algorithm.
However, if our collection is large and dynamic, ie items are being added and deleted
continually, then we can obtain considerably better performance using a data structure
called a tree.
Note that Big Oh is a notation formally describing the set of all functions which
are bounded above by a nominated function.