Data Structure & Algorithms
COURSE DETAILS
Course Writer/Developer: Dr. A.T. Akinwale & Miss A. J. Ikuomola
Department of Computer Science
College of Natural Science
University of Agriculture Abeokuta,
Ogun State, Nigeria
COURSE CONTENT
Mathematical notation and function, overview of data structure, Hash Function, Linked
List, Array, Stacks, Queues, Recursion
COURSE REQUIREMENTS
This is a compulsory course for all students in the Department of Computer Science,
Mathematics and Statistics. In view of this, students are expected to participate in all the
activities and have a minimum of 75% attendance to be able to write the final
examination.
READING LIST
Goodrich M.T. and Tamassia R. Data Structures and Algorithms in Java, 4th Edition.
Lipschutz S. (1986). Schaum’s Outline of Theory and Problems of Data Structures.
McGraw-Hill Book Company.
Storer J.A. (2002). An Introduction to Data Structures and Algorithms.
Summation Symbol (Sums)
$\sum_{j=1}^{n} a_j$ and $\sum_{j=m}^{n} a_j$
Example:
(1) $\sum_{i=1}^{n} a_i = a_1 + a_2 + a_3 + a_4 + \cdots + a_n$
(2) $\sum_{i=1}^{n} a_i b_i = a_1 b_1 + a_2 b_2 + a_3 b_3 + a_4 b_4 + \cdots + a_n b_n$
(3) $\sum_{j=2}^{5} j^2 = 2^2 + 3^2 + 4^2 + 5^2 = 4 + 9 + 16 + 25 = 54$
(4) $\sum_{j=1}^{n} j = 1 + 2 + 3 + 4 + \cdots + n$
Pi (Product)
$\prod_{i=1}^{n} x_i = x_1 \cdot x_2 \cdot x_3 \cdots x_n$
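The sum and product notations above map directly onto loops. The following is a minimal Python sketch; the list `a` holds hypothetical values chosen only for illustration.

```python
import math

# Hypothetical values a_1 .. a_n (not taken from the notes).
a = [3, 1, 4, 1, 5]

# Sum of a_i for i = 1..n.
total = sum(a)

# Sum of j^2 for j = 2..5, as in example (3).
sum_of_squares = sum(j ** 2 for j in range(2, 6))

# Product of x_i for i = 1..n (math.prod needs Python 3.8 or later).
product = math.prod(a)

print(total, sum_of_squares, product)  # 14 54 60
```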
Floor Function
Let x be any real number, then x lies between two integers called the floor and the ceiling of x.
Specifically,
⌊x⌋, called the floor of x, denotes the greatest integer that does not exceed x.
Examples:
(1) ⌊3.14⌋ = 3
(2) ⌊√5⌋ = ⌊2.23...⌋ = 2
(3) ⌊-8.5⌋ = -9
(4) ⌊7⌋ = 7
Ceiling Function
⌈x⌉, called the ceiling of x, denotes the least integer that is not less than x.
Examples:
(1) ⌈3.14⌉ = 4
(2) ⌈√5⌉ = ⌈2.23...⌉ = 3
(3) ⌈-8.5⌉ = -8
(4) ⌈7⌉ = 7
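The floor and ceiling examples can be checked with Python's standard `math` module. This is only a small sketch; the variable names are illustrative.

```python
import math

values = [3.14, math.sqrt(5), -8.5, 7]

# Greatest integer that does not exceed x.
floors = [math.floor(x) for x in values]

# Least integer that is not less than x.
ceilings = [math.ceil(x) for x in values]

print(floors)    # [3, 2, -9, 7]
print(ceilings)  # [4, 3, -8, 7]
```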
Remainder Function (Modular Arithmetic)
Let k be any integer and let M be a positive integer. Then k (mod M)
(read k modulo M) will denote the integer remainder when k is divided by M. More exactly,
k (mod M) is the unique integer r such that
k = Mq + r, where 0 ≤ r < M
Examples:
(1) 25 (mod 7)
25/7 = 3 r 4
25 (mod 7) = 4
(2) 25 (mod 5)
25/5 = 5 r 0
25 (mod 5) = 0
(3) 35 (mod 11)
35/11 = 3 r 2
35 (mod 11) = 2
(4) 3 (mod 8)
3/8 = 0 r 3
3 (mod 8) = 3 (note that 3 = 8 · 0 + 3, so q = 0)
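As a quick check of the examples above, Python's built-in `divmod` returns the quotient q and the remainder r together. This sketch covers only non-negative k, as in the examples.

```python
# (k, M) pairs from the worked examples.
pairs = [(25, 7), (25, 5), (35, 11), (3, 8)]

results = []
for k, m in pairs:
    q, r = divmod(k, m)          # k = m*q + r with 0 <= r < m
    assert k == m * q + r
    results.append((q, r))

print(results)  # [(3, 4), (5, 0), (3, 2), (0, 3)]
```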
UNIT 2: DATA STRUCTURE
Introduction
A data structure is a particular way of storing and organizing data in a computer so that it can be
used efficiently. It is the logical arrangement of data elements together with the set of
operations needed to access those elements. The logical or mathematical model of a
particular organization of data is called a data structure. It can also be defined as a set of rules and
constraints that show the relationships that exist between individual pieces of data.
Basic Principle
Data structures are generally based on the ability of a computer to fetch and store data at any
place in its memory, specified by an address – a bit string that can be stored in memory and
manipulated by the program. Thus the record and array data structures are based on computing
the addresses of data items with arithmetic operations; while the linked data structures are based
on storing addresses of data items within the structure itself. Many data structures use both
principles.
The choice of a data structure for a particular problem depends on the following factors:
1) Volume of data involved
2) Frequency and ways in which data will be used.
3) Dynamic and static nature of the data.
4) Amount of storage required by the data structure.
5) Time to retrieve an element.
6) Ease of programming.
Traversal (Traversing): accessing each element or record in the list exactly once, so that
certain items in the record may be processed. This accessing and processing is sometimes called
“visiting” the record.
Search (Searching): finding the location of the record with a given key value, or finding the
locations of all records which satisfy one or more conditions.
Inserting: adding a new record to the structure.
Deleting: removing a record from the structure.
Sorting: arranging the records in some logical order (e.g. alphabetically according to some
NAME key or in numerical order according to some NUMBER key such as social security
number, account number, matric number, etc.)
Merging: combining the records in two different sorted files into a single sorted file.
Quick inserts and quick deletes (tree always remains balanced; similar trees are good for disk storage).
Graph: best models real-world situations, but some algorithms are slow and very complex.
UNIT 3: HASH FUNCTION
A hash function is any well-defined procedure or mathematical function that converts a large,
possibly variable-sized amount of data into a small datum, usually a single integer that may serve as
an index to an array. The values returned by a hash function are called hash values, hash codes,
hash sums or simply hashes.
1. Division Method
Choose a number m larger than the number n of keys in K (the number m is usually chosen to be
a prime number or a number without small divisors, since this frequently minimizes the
number of collisions). Then the hash function H is defined by:
H(k) = k (mod m) or H(k) = k (mod m) + 1
The first formula, k (mod m), denotes the remainder when k is divided by m, while the second
formula is used when we want the hash addresses to range from 1 to m rather than from 0 to m-1.
Example:
Suppose a company with 68 employees assigns a 4-digit employee number to each employee,
which is used as the primary key in the company’s employee file. Suppose L consists of 100
two-digit addresses 00, 01, 02, …, 99. Apply the hash function to each of the following employee
numbers: 3205, 7148, 2345.
Solution:
Using the division method, choose a prime number m which is close to 99, such as m = 97. Then
H(k) = k (mod m)
a) H(3205) = 3205 (mod 97) = 4
That is, dividing 3205 by 97 gives a remainder of 4 (3205 = 97 · 33 + 4).
b) H(7148) = 7148 (mod 97) = 67
That is, dividing 7148 by 97 gives a remainder of 67 (7148 = 97 · 73 + 67).
c) H(2345) = 2345 (mod 97) = 17
That is, dividing 2345 by 97 gives a remainder of 17 (2345 = 97 · 24 + 17).
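The division method can be sketched in a few lines of Python. The function name `hash_division` is hypothetical; m = 97 is the prime chosen in the solution above.

```python
def hash_division(k, m=97):
    """Division method: H(k) = k (mod m), giving addresses 0 .. m-1."""
    return k % m

keys = [3205, 7148, 2345]
addresses = [hash_division(k) for k in keys]
print(addresses)  # [4, 67, 17]

# Variant H(k) = k (mod m) + 1, giving addresses 1 .. m.
shifted = [hash_division(k) + 1 for k in keys]
```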
2. Midsquare
The key k is squared. Then the hash function H is defined by
H(k) = l
where l is obtained by deleting digits from both ends of k^2. Note that the same positions of k^2
must be used for all of the keys.
Example
Apply the midsquare method to the employee numbers 3205, 7148 and 2345 used above.
Solution
The following calculations are performed:
k       3205        7148        2345
k^2     10272025    51093904    5499025
H(k)    72          93          99
Observe that the fourth and fifth digits of k^2, counting from the right, are chosen for the hash address.
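A minimal Python sketch of the midsquare method follows. The function name `hash_midsquare` is hypothetical, and the choice of digit positions (fourth and fifth from the right) simply follows the example; other positions are equally valid as long as they are the same for every key.

```python
def hash_midsquare(k):
    """Midsquare method: keep the 4th and 5th digits of k^2 from the right."""
    square = k * k
    # Drop the last three digits, then keep the next two.
    return (square // 1000) % 100

keys = [3205, 7148, 2345]
addresses = [hash_midsquare(k) for k in keys]
print(addresses)  # [72, 93, 99]
```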
3. Folding Method
The key k is partitioned into a number of parts k1, …, kr, where each part, except possibly the
last, has the same number of digits as the required address. Then the parts are added together,
ignoring the last carry. That is,
H(k) = k1 + k2 + k3 + … + kr
where the leading-digit carries, if any, are ignored. Sometimes, for extra “milling”, the
even-numbered parts k2, k4, …, are each reversed before the addition.
Example:
Chopping the key k into two parts and adding yields the following hash addresses:
H(3205) = 32 + 05 = 37
H(7148) = 71 + 48 = 119 → 19
H(2345) = 23 + 45 = 68
Observe that the leading digit 1 of H(7148) is ignored. Alternatively, one may want to
reverse the second part before adding, thus producing the following hash addresses:
H(3205) = 32 + 50 = 82
H(7148) = 71 + 84 = 155 → 55
H(2345) = 23 + 54 = 77
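The folding method for 4-digit keys and 2-digit addresses can be sketched as below. The function name `hash_folding` and the `reverse_even` flag are hypothetical names introduced for illustration.

```python
def hash_folding(k, reverse_even=False):
    """Folding method for a 4-digit key and a 2-digit address."""
    digits = f"{k:04d}"                 # e.g. 3205 -> "3205"
    parts = [digits[0:2], digits[2:4]]  # k1, k2
    if reverse_even:
        parts[1] = parts[1][::-1]       # reverse the even-numbered part k2
    # Add the parts and ignore the leading carry digit.
    return sum(int(p) for p in parts) % 100

keys = [3205, 7148, 2345]
plain = [hash_folding(k) for k in keys]
milled = [hash_folding(k, reverse_even=True) for k in keys]
print(plain)   # [37, 19, 68]
print(milled)  # [82, 55, 77]
```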
Basic Concepts
A linked list is a data structure that consists of a sequence of data records such that each record
has a field that contains a reference (link) to the next record.
→ [ Info | Link ]
Each record of a linked list is called a NODE. A node is made up of two parts: the information
(data) part and the pointer (link) part.
Linear List
1 ● → 99 ● → 37 X
In linear linked list, the components are all linked together in some sequential manner.
Circular List
12 ● → 99 ● → 37 ●
Singly, doubly and multiply linked lists are examples of linked lists:
A singly-linked list contains nodes which have a data field as well as a next field, which points to
the next node in the linked list.
In a doubly-linked list, each node contains, besides the next-node link, a second link field
pointing to the previous node in the sequence. The two links may be called forward(s) and
backward(s).
Let LIST be a linked list. LIST requires two linear arrays called INFO and LINK, such that
INFO[K] and LINK[K] contain, respectively, the information part and the next-pointer field of a
node of LIST. It should be noted that LIST also requires a variable name, such as START, which
indicates the beginning of the list, and a next-pointer sentinel, denoted by NULL, which indicates
the end of the list.
Example
START = 9
K     INFO   LINK
3     O      6
4     T      0
6     □      11
7     X      10
9     N      3
10    I      4
11    E      7
12
Interpreted as:
START = 9, so INFO [9] = N (is the first character)
LINK [9] = 3, so INFO [3] = O (is the second character)
LINK [3] = 6, so INFO [6] = □ (blank) is the third character
LINK [6] =11, so INFO [11] = E is the fourth character
LINK [11] = 7, so INFO [7] = X is the fifth character
LINK [7] = 10, so INFO [10] = I is the sixth character
LINK [10] = 4, so INFO [4] = T is the seventh character
LINK [4] = 0, the NULL value, so the list has ended
In other words, NO EXIT is the character string
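The walk through the INFO and LINK arrays above can be sketched in Python as follows. The function name `traverse` is hypothetical; the arrays are padded to index 12 so that the 1-based positions of the example can be used directly, and 0 plays the role of NULL.

```python
NULL = 0

# Positions 1..12; index 0 is unused so indices match the example.
INFO = [None] * 13
LINK = [NULL] * 13
nodes = {3: ("O", 6), 4: ("T", 0), 6: (" ", 11), 7: ("X", 10),
         9: ("N", 3), 10: ("I", 4), 11: ("E", 7)}
for k, (info, link) in nodes.items():
    INFO[k], LINK[k] = info, link
START = 9

def traverse(info, link, start):
    """Visit each node once, following LINK until NULL is reached."""
    items = []
    ptr = start
    while ptr != NULL:
        items.append(info[ptr])
        ptr = link[ptr]
    return items

text = "".join(traverse(INFO, LINK, START))
print(text)  # NO EXIT
```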
Example:
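A hospital ward contains 12 beds, of which 9 are occupied. The listing of the patients is given by
the pointer field Next; following the pointers from START visits the patients in alphabetical order.
START = 5
Bed Number   Patient   Next
1            Kunle     7
3            Daniel    11
4            Micheal   12
5            Ade       3
7            Lanre     4
8            Gabriel   1
9            Samuel    0
10
11           Femi      8
12           Nike      9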
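Example:
Two linked lists, ALG and GEOM, are kept in the same pair of arrays. The list pointer
variables are ALG = 11 and GEOM = 5.
Node   INFO   LINK
2      74     14
4      82     0
5      84     12
6      78     0
7      74     8
8      100    13
11     88     2
12     62     7
13     74     6
14     93     4
For ALG:
ALG = 11 → [88 | 2] → [74 | 14] → [93 | 4] → [82 | NIL]
The information for ALG is 88, 74, 93, 82
For GEOM:
GEOM = 5 → [84 | 12] → [62 | 7] → [74 | 8] → [100 | 13] → [74 | 6] → [78 | NIL]
The information for GEOM is 84, 62, 74, 100, 74 and 78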
Searching
Algorithm 2:
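LIST is a linked list in memory. This algorithm finds the location LOC of the node where ITEM first
appears in LIST, or sets LOC = NULL
(1) Set PTR := START
(2) Repeat step 3 while PTR ≠ NULL
(3) If ITEM = INFO [PTR], then
Set LOC := PTR and Exit
Else
Set PTR := LINK [PTR]
[End of step 2 loop]
(4) Set LOC := NULL [search is unsuccessful]
(5) Exit
Example: searching the list NO EXIT above for ITEM = X visits the nodes 9 (N), 3 (O), 6 (□), 11 (E) and 7 (X).
∴ LOC = 7 (i.e. the location of X is 7)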
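The search described above can be sketched in Python. This is an illustrative version under stated assumptions: the function name `search` is hypothetical, dictionaries keyed by position stand in for the INFO and LINK arrays, and 0 stands for NULL. The data is the NO EXIT list from the earlier example.

```python
NULL = 0

# Position -> value, standing in for the INFO and LINK arrays.
INFO = {3: "O", 4: "T", 6: " ", 7: "X", 9: "N", 10: "I", 11: "E"}
LINK = {3: 6, 4: 0, 6: 11, 7: 10, 9: 3, 10: 4, 11: 7}
START = 9

def search(info, link, start, item):
    """Return the location of the first node holding item, or NULL."""
    ptr = start
    while ptr != NULL:
        if info[ptr] == item:
            return ptr          # successful search
        ptr = link[ptr]         # move to the next node
    return NULL                 # item is not in the list

loc_x = search(INFO, LINK, START, "X")
loc_z = search(INFO, LINK, START, "Z")
print(loc_x, loc_z)  # 7 0
```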
UNIT 5: ARRAY
Linear Array
A linear array is a list of a finite number n of homogeneous data elements (i.e. data elements of the same
type) such that:
a) The elements of the array are referenced respectively by an index set consisting of n
consecutive numbers.
b) The elements of the array are stored respectively in successive memory locations.
The number n of elements is called the length or size of the array.
In general, the length or the number of data elements of the array can be obtained from the index set
by the formula:
Length = UB - LB + 1
UB = largest index, called the Upper Bound
LB = smallest index, called the Lower Bound of the array.
NB: Length = UB when LB = 1
The elements of an array A can be denoted by A1, A2, ..., An
Example:
Let DATA be a six-element linear array of integers such that:
DATA [1] = 247 DATA [2] = 56 DATA [3] = 429
DATA [4] = 135 DATA [5] = 87 DATA [6] = 156
DATA: 247, 56, 429, 135, 87, 156
This array can be pictured in either of the following forms:
DATA
1   247
2   56
3   429
4   135
5   87
6   156
OR
DATA: 247 56 429 135 87 156
Example 2:
An automobile company uses an array AUTO to record the number of automobiles sold each
year from 1932 through 1984.
Solution:
AUTO [K] = number of automobiles sold in the year K.
Lower Bound = LB = 1932
Upper Bound = UB = 1984
Length = UB – LB+1
= 1984 -1932+1
Length = 53
[Fig. 1: computer memory pictured as a sequence of cells with consecutive addresses 1001, 1002, 1003, 1004, ...]
Example 3:
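Consider again the array AUTO in Example 2, which records the number of automobiles sold each
year from 1932 through 1984. Suppose AUTO appears in memory as pictured in Fig. (2), i.e. the base address
of AUTO is 200 and w = 4 words per memory cell for AUTO.
[Fig. (2): AUTO in memory starting at address 200, with four words per element: AUTO[1932] at 200 to 203, AUTO[1933] at 204 to 207, AUTO[1934] at 208 to 211, ...]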
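The length formula and the memory layout of AUTO can be illustrated with a short Python sketch. The addressing rule used here, LOC(LA[K]) = Base(LA) + w(K - LB), is the standard one for linear arrays and is stated as an assumption because it is not written out in these notes; the function names are hypothetical.

```python
def array_length(lb, ub):
    """Length = UB - LB + 1."""
    return ub - lb + 1

def address_of(base, w, lb, k):
    """Assumed addressing rule: LOC(LA[K]) = Base(LA) + w * (K - LB)."""
    return base + w * (k - lb)

length = array_length(1932, 1984)
first_three = [address_of(200, 4, 1932, year) for year in (1932, 1933, 1934)]
loc_1965 = address_of(200, 4, 1932, 1965)

print(length)       # 53
print(first_three)  # [200, 204, 208]
print(loc_1965)     # 332
```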
Algorithm
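Traversing a Linear Array
Here LA is a linear array with lower bound LB and upper bound UB. This algorithm applies an operation PROCESS to each element of LA.
1. Repeat for K = LB to UB
2. Apply PROCESS to LA[K]
[End of loop]
3. Exit.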
Example 4:
Consider Example 2. Find the number NUM of years during which more than 300 automobiles
were sold.
Solution: using the algorithm
1) Set NUM := 0 [initialize counter]
2) Repeat for K = 1932 to 1984
If AUTO [K] > 300, then set NUM := NUM + 1
[End of loop]
3) Exit.
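The counting traversal can be sketched in Python as follows. The sales figures are made up purely so that the code runs (the notes give no actual data); a dictionary keyed by year stands in for the array AUTO.

```python
LB, UB = 1932, 1984

# Hypothetical sales figures: 200 in 1932, rising by 5 each year.
AUTO = {year: 200 + 5 * (year - LB) for year in range(LB, UB + 1)}

num = 0                              # initialize counter
for k in range(LB, UB + 1):          # K = 1932 to 1984
    if AUTO[k] > 300:
        num += 1

print(num)  # 32 (with this made-up data, the years 1953 to 1984)
```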
Algorithm:
(Deleting from a Linear Array) DELETE (LA, N, K, ITEM)
Here LA is a linear array with N elements and K is a positive integer such that K ≤ N. This
algorithm deletes the Kth element from LA.
1.Set ITEM := LA[K]
2.Repeat for J = K to N-1
Set LA [J]:= LA [J+1]
[End of loop]
3.Set N:= N-1
4.Exit.
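A minimal Python sketch of DELETE is given below. To keep the 1-based subscripts of the algorithm, index 0 of the list is left unused; this padding, the function name `delete`, and the clearing of the vacated last slot are choices made here for illustration and are not part of the algorithm above.

```python
def delete(la, n, k):
    """Delete the k-th element (1-based) of la[1..n]; return (item, new n)."""
    item = la[k]
    for j in range(k, n):        # J = K to N-1
        la[j] = la[j + 1]        # move each later element one place down
    la[n] = None                 # clear the vacated slot (illustrative extra)
    return item, n - 1

# DATA from the earlier example, with index 0 unused.
data = [None, 247, 56, 429, 135, 87, 156]
item, n = delete(data, 6, 3)
print(item, n)     # 429 5
print(data[1:n + 1])  # [247, 56, 135, 87, 156]
```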
Example:
MULTIDIMENSIONAL ARRAYS
A two-dimensional m × n array A is a collection of m · n data elements such that each element
is specified by a pair of integers (such as J, K), called subscripts, with the property that 1 ≤ J ≤ m
and 1 ≤ K ≤ n.
The element of A with first subscript J and second subscript K will be denoted by A_{J,K} or A [J,
K].
Two-dimensional arrays are sometimes called matrices or matrix arrays.
                  Columns
          1         2         3         4
Rows  1   A [1, 1]  A [1, 2]  A [1, 3]  A [1, 4]
      2   A [2, 1]  A [2, 2]  A [2, 3]  A [2, 4]
      3   A [3, 1]  A [3, 2]  A [3, 3]  A [3, 4]
Two-dimensional 3 × 4 array
REPRESENTATION OF TWO DIMENSIONAL ARRAYS IN MEMORY
A two-dimensional array can be represented in memory in two ways, shown here for the 3 × 4 array A:
1. Column-Major Order          2. Row-Major Order
   Subscript                      Subscript
   (1, 1)                         (1, 1)
   (2, 1)  column 1               (1, 2)  row 1
   (3, 1)                         (1, 3)
   (1, 2)                         (1, 4)
   (2, 2)  column 2               (2, 1)
   (3, 2)                         (2, 2)  row 2
   (1, 3)                         (2, 3)
   (2, 3)  column 3               (2, 4)
   (3, 3)                         (3, 1)
   (1, 4)                         (3, 2)  row 3
   (2, 4)  column 4               (3, 3)
   (3, 4)                         (3, 4)
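The two storage orders differ only in which subscript varies fastest. The following Python sketch lists the subscripts of an m × n array in each order; the function names are hypothetical.

```python
def column_major(m, n):
    """Subscripts (J, K) column by column: J varies fastest."""
    return [(j, k) for k in range(1, n + 1) for j in range(1, m + 1)]

def row_major(m, n):
    """Subscripts (J, K) row by row: K varies fastest."""
    return [(j, k) for j in range(1, m + 1) for k in range(1, n + 1)]

col_order = column_major(3, 4)
row_order = row_major(3, 4)
print(col_order[:4])  # [(1, 1), (2, 1), (3, 1), (1, 2)]
print(row_order[:5])  # [(1, 1), (1, 2), (1, 3), (1, 4), (2, 1)]
```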
UNIT 6: STACKS, QUEUES, RECURSION
A Stack is a linear structure in which items may be added or removed only at one end.
Examples of such a structure: a stack of dishes, a stack of pennies and a stack of folded towels.
Observe that an item may be added or removed only from the top of any of the stacks.
STACKS
A stack is a list of elements in which an element may be inserted or deleted only at one end,
called the top of the stack. This means, in particular, that elements are removed from a stack in
the reverse order of that in which they were inserted into the stack.
Special terminology is used for two basic operations associated with stacks:
(a) “Push” is the term used to insert an element into a stack.
(b) “Pop” is the term used to delete an element from a stack.
We emphasize that these terms are used only with stacks, not with other data structures.
Examples:
Suppose the following 6 elements are pushed, in order, onto an empty stack:
AAA, BBB, CCC, DDD, EEE, FFF
Fig. 2 shows three ways of picturing such a stack. For notational convenience, we will
frequently designate the stack by writing:
Stack: AAA, BBB, CCC, DDD, EEE, FFF
The implication is that the rightmost element is the top element. We emphasize that,
regardless of the way a stack is described, its underlying property is that insertion and deletion
can occur only at the top of the stack. This means EEE cannot be deleted before FFF is deleted,
DDD cannot be deleted before EEE and FFF are deleted, and so on. Consequently, the elements
may be popped from the stack only in the reverse order of that in which they were pushed onto
the stack.
[Fig 2: Diagram of stacks. (a) The stack drawn vertically, with FFF at the top and AAA at the bottom. (b) and (c) The same stack drawn as an array with positions 1 to N holding AAA, BBB, CCC, DDD, EEE, FFF, with a Top marker at FFF.]
[Fig 3: the array STACK holding XXX, YYY, ZZZ in its first three positions, with Top = 3 and MAXSTK = 8.]
The operation of adding (pushing) an item onto a stack and the operation of removing (popping)
an item from a stack may be implemented, respectively, by the following procedures, called
PUSH and POP. In executing the procedure PUSH, one must first test whether there is room in
the stack for the new item; if not, then we have the condition known as overflow. Analogously, in
executing the procedure POP, one must first test whether there is an element in the stack to be
deleted; if not, then we have the condition known as underflow.
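The overflow and underflow tests described above can be sketched in Python with an array-based stack. This is an illustrative sketch, not the PUSH and POP procedures of these notes: the class name `ArrayStack` and the use of Python exceptions to signal overflow and underflow are assumptions, while TOP and MAXSTK follow the names in Fig 3.

```python
class ArrayStack:
    """Stack kept in positions 1..MAXSTK of an array; TOP = 0 means empty."""

    def __init__(self, maxstk):
        self.maxstk = maxstk
        self.items = [None] * (maxstk + 1)   # index 0 is unused
        self.top = 0

    def push(self, item):
        if self.top == self.maxstk:          # no room left: overflow
            raise OverflowError("stack overflow")
        self.top += 1
        self.items[self.top] = item

    def pop(self):
        if self.top == 0:                    # nothing to delete: underflow
            raise IndexError("stack underflow")
        item = self.items[self.top]
        self.top -= 1
        return item

stack = ArrayStack(maxstk=8)
for name in ("XXX", "YYY", "ZZZ"):
    stack.push(name)
top_after_pushes = stack.top      # 3, as in Fig 3
popped = stack.pop()              # "ZZZ": last in, first out
```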