Fundamentals of Data Structures in C, 2nd Edition - Ellis Horowitz, Sartaj Sahni, Dinesh Mehta
A Set of Instructions
Data Structures + Algorithms
Data Structure = A Container that stores Data
Algorithm = Logic + Control
Functions of Data Structures
Add
Index
Key
Position
Priority
Get
Change
Delete
Common Data Structures
Array
Stack
Queue
Linked List
Tree
Heap
Hash Table
Priority Queue
How many Algorithms?
Countless
Algorithm Strategies
Greedy
Divide and Conquer
Dynamic Programming
Exhaustive Search
Which Data Structure or Algorithm
is better?
Must Meet Requirement
High Performance
Low RAM footprint
Easy to implement
Encapsulated
Chapter 1 Basic Concepts
Overview: System Life Cycle
Algorithm Specification
Data Abstraction
Performance Analysis
Performance Measurement
1.1 Overview: system life cycle (1/2)
Good programmers regard large-scale
computer programs as systems that
contain many complex interacting parts.
As systems, these programs undergo a
development process called the system
life cycle.
1.1 Overview (2/2)
We consider this cycle as consisting of
five phases.
Requirements
Analysis: bottom-up vs. top-down
Design: data objects and operations
Refinement and Coding
Verification
Program Proving
Testing
Debugging
1.2 Algorithm Specification (1/10)
1.2.1 Introduction
An algorithm is a finite set of instructions that
accomplishes a particular task.
Criteria
input: zero or more quantities that are externally supplied
output: at least one quantity is produced
definiteness: clear and unambiguous
finiteness: terminate after a finite number of steps
effectiveness: instruction is basic enough to be carried out
A program does not have to satisfy the finiteness criterion.
1.2 Algorithm Specification
(2/10)
Representation
A natural language, like English or Chinese.
A graphic, like flowcharts.
A computer language, like C.
Algorithms + Data Structures =
Programs [Niklaus Wirth]
Sequential search vs. Binary search
1.2 Algorithm Specification
(3/10)
Example 1.1 [Selection sort]:
From those integers that are currently unsorted, find the
smallest and place it next in the sorted list.
i [0] [1] [2] [3] [4]
- 30 10 50 40 20
0 10 30 50 40 20
1 10 20 40 50 30
2 10 20 30 40 50
3 10 20 30 40 50
1.2 (4/10)
Program 1.3 contains
a complete program
which you may run on
your computer
1.2 Algorithm Specification
(5/10)
Example 1.2 [Binary search]:
[0] [1] [2] [3] [4] [5] [6]
8 14 26 30 43 50 52
left right middle list[middle] : searchnum
searching for 43:
0 6 3 30 < 43
4 6 5 50 > 43
4 4 4 43 == 43 (found at index 4)
searching for 18:
0 6 3 30 > 18
0 2 1 14 < 18
2 2 2 26 > 18
2 1 - (left > right: not found)
Trace of a recursive permutation generator on "abc" (the beginning of the trace is truncated):
lv1 SWAP: i=1, j=2 abc
lv2 perm: i=2, n=2 acb
print: acb
lv1 SWAP: i=1, j=2 acb
lv0 SWAP: i=0, j=0 abc
lv0 SWAP: i=0, j=1 abc
lv1 perm: i=1, n=2 bac
lv1 SWAP: i=1, j=1 bac
lv2 perm: i=2, n=2 bac
print: bac
lv1 SWAP: i=1, j=1 bac
lv1 SWAP: i=1, j=2 bac
lv2 perm: i=2, n=2 bca
print: bca
lv1 SWAP: i=1, j=2 bca
lv0 SWAP: i=0, j=1 bac
lv0 SWAP: i=0, j=2 abc
lv1 perm: i=1, n=2 cba
lv1 SWAP: i=1, j=1 cba
lv2 perm: i=2, n=2 cba
print: cba
lv1 SWAP: i=1, j=1 cba
lv1 SWAP: i=1, j=2 cba
lv2 perm: i=2, n=2 cab
print: cab
lv1 SWAP: i=1, j=2 cab
lv0 SWAP: i=0, j=2 cba
1.3 Data abstraction (1/4)
Data Type
A data type is a collection of objects and a set of
operations that act on those objects.
For example, the data type int consists of the objects {0,
+1, -1, +2, -2, …, INT_MAX, INT_MIN} and the operations
+, -, *, /, and %.
The data types of C
The basic data types: char, int, float and double
The group data types: array and struct
The pointer data type
The user-defined types
1.3 Data abstraction (2/4)
Abstract Data Type
An abstract data type (ADT) is a data type
that is organized in such a way that
the specification of the objects and
the operations on the objects is separated from
the representation of the objects and
the implementation of the operations.
::= is defined as
1.4 Performance analysis (1/17)
Criteria
Is it correct?
Is it readable?
…
Performance Analysis (machine independent)
space complexity: storage requirement
time complexity: computing time
Performance Measurement (machine dependent)
1.4 Performance analysis (2/17)
1.4.1 Space Complexity:
S(P)=C+SP(I)
Fixed Space Requirements (C)
Independent of the characteristics
of the inputs and outputs
instruction space
space for simple variables, fixed-size structured
variable, constants
Variable Space Requirements (SP(I))
depend on the instance characteristic I
number, size, values of inputs and outputs
associated with I
recursive stack space, formal parameters, local
variables, return address
1.4 Performance analysis (3/17)
Examples:
Example 1.6: In Program 1.9, Sabc(I) = 0.
For the recursive sum of Program 1.11, Ssum(I) = Ssum(n) = 6n.
1.4 Performance analysis (5/17)
1.4.2 Time Complexity:
T(P)=C+TP(I)
The time, T(P), taken by a program, P, is the
sum of its compile time C and its run (or
execution) time, TP(I)
Fixed time requirements
Compile time (C), independent of instance
characteristics
Variable time requirements
Run (execution) time TP
T_P(n) = c_a·ADD(n) + c_s·SUB(n) + c_l·LDA(n) + c_st·STA(n)
1.4 Performance analysis (6/17)
A program step is a syntactically or
semantically meaningful program segment
whose execution time is independent of the
instance characteristics.
Example
(Regard as the same unit machine independent)
abc = a + b + b * c + (a + b - c) / (a + b) + 4.0
abc = a + b + c
Methods to compute the step count
Introduce variable count into programs
Tabular method
Determine the total number of steps contributed by
each statement step per execution frequency
add up the contribution of all statements
1.4 Performance analysis (7/17)
Iterative summing of a list of numbers
*Program 1.12: Program 1.10 with count statements (p.23)
float sum(float list[], int n)
{
  float tempsum = 0;   count++;   /* for assignment */
  int i;
  for (i = 0; i < n; i++) {
    count++;                      /* for the for loop */
    tempsum += list[i]; count++;  /* for assignment */
  }
  count++;                        /* last execution of for */
  count++;                        /* for return */
  return tempsum;
}                                 /* 2n + 3 steps in all */
1.4 Performance analysis (8/17)
Tabular Method
*Figure 1.2: Step count table for Program 1.10 (p.26)
Iterative function to sum a list of numbers
steps/execution
Statement s/e Frequency Total steps
float sum(float list[ ], int n) 0 0 0
{ 0 0 0
float tempsum = 0; 1 1 1
int i; 0 0 0
for(i=0; i <n; i++) 1 n+1 n+1
tempsum += list[i]; 1 n n
return tempsum; 1 1 1
} 0 0 0
Total 2n+3
1.4 Performance analysis (9/17)
Recursive summing of a list of numbers
*Program 1.14: Program 1.11 with count statements added (p.24)
f(n) = 10n² + 4n + 2
10n² + 4n + 2 <= 11n² for all n >= 5, so 10n² + 4n + 2 = O(n²)
10n² + 4n + 2 >= n² for all n >= 1, so 10n² + 4n + 2 = Ω(n²)
n² <= 10n² + 4n + 2 <= 11n² for all n >= 5, so 10n² + 4n + 2 = Θ(n²)
Example: 1-dimensional array addressing
int one[] = {0, 1, 2, 3, 4};
Goal: print out each element's address and value
(in the figure, with 2-byte ints, one[2] = 2 is at address 1232, one[3] = 3 at 1234, one[4] = 4 at 1236)
void print1(int *ptr, int rows)
{
  /* print out a one-dimensional array using a pointer */
  int i;
  printf("Address Contents\n");
  for (i = 0; i < rows; i++)
    printf("%8u%5d\n", ptr + i, *(ptr + i));
  printf("\n");
}
2.2 Structures and Unions (1/6)
2.2.1 Structures (records)
Arrays are collections of data of the same type. In C
there is an alternate way of grouping data that permits
the data to vary in type.
This mechanism is called the struct, short for structure.
A structure is a collection of data items, where each
item is identified as to its type and name.
2.2 Structures and Unions (2/6)
Create structure data type
We can create our own structure data types by using
the typedef statement as below:
person1.sex_info.sex = male;
person1.sex_info.u.beard =
FALSE;
and
person2.sex_info.sex = female;
person2.sex_info.u.children = 4;
2.2 Structures and Unions (5/6)
2.2.3 Internal implementation of structures
The fields of a structure in memory will be stored in
the same way using increasing address locations in
the order specified in the structure definition.
Holes or padding may actually occur
Within a structure to permit two consecutive components to
be properly aligned within memory
The size of an object of a struct or union type is the
amount of storage necessary to represent the largest
component, including any padding that may be
required.
2.2 Structures and Unions (6/6)
2.2.4 Self-Referential Structures
One or more of its components is a pointer to itself.
Two polynomials, A and B, represented
as above, are added to obtain D = A + B.
To produce D(x), padd
(Program 2.5) adds A(x)
and B(x) term by term.
Analysis: O(n+m),
where n (m) is the number
of nonzero terms in A (B).
2.3 The polynomial ADT (10/10)
sparse matrix: which
data structure?
(a 5×3 matrix with 15/15 entries nonzero is dense;
a 6×6 matrix with only 8/36 nonzero is sparse)
2.4 The sparse matrix ADT (2/18)
The standard representation of a matrix is a two-
dimensional array defined as
a[MAX_ROWS][MAX_COLS].
We can locate any element quickly by writing a[i][j].
Sparse matrix wastes space
We must consider alternate forms of representation.
Our representation of sparse matrices should store only
nonzero elements.
Each element is characterized by <row, col, value>.
2.4 The sparse matrix ADT (3/18)
Structure 2.3
contains our
specification of
the matrix ADT.
A minimal set of
operations
includes matrix
creation,
addition,
multiplication,
and transpose.
2.4 The sparse matrix ADT (4/18)
We implement the Create operation as
below:
2.4 The sparse matrix ADT (5/18)
Figure 2.4(a) shows how the sparse matrix of Figure
2.3(b) is represented in the array a.
Represented by a two-dimensional array.
Each element is characterized by <row, col, value>.
a[0] records the number of rows, the number of columns,
and the number of nonzero terms; the remaining entries
store the nonzero terms with row, column in
ascending order. (The transpose is stored the same way.)
2.4 The sparse matrix ADT (6/18)
2.4.2 Transpose a Matrix
For each row i
take element <i, j, value> and store it in element <j, i, value> of
the transpose.
difficulty: where to put <j, i, value>
(0, 0, 15) ====> (0, 0, 15)
(0, 3, 22) ====> (3, 0, 22)
(0, 5, -15) ====> (5, 0, -15)
(1, 1, 11) ====> (1, 1, 11)
Move elements down very often.
For all elements in column j,
place element <i, j, value> in element <j, i, value>
2.4 The sparse matrix ADT (7/18)
This algorithm is incorporated in transpose
(Program 2.7).
2.4 The sparse matrix ADT (10/18)
After the execution of the third for loop, the
values of row_terms and starting_pos are:
               [0] [1] [2] [3] [4] [5]
row_terms    =   2   1   2   2   0   1
starting_pos =   1   3   4   6   8   8
2.4 The sparse matrix ADT (11/18)
2.4.3 Matrix multiplication
Definition:
Given A and B, where A is m×n and B is n×p, the
product matrix D has dimension m×p. Its <i, j>
element is
d_ij = Σ_{k=0}^{n-1} a_ik · b_kj
for 0 ≤ i < m and 0 ≤ j < p.
Example:
2.4 The sparse matrix ADT (12/18)
Sparse Matrix Multiplication
Definition: [D]_{m×p} = [A]_{m×n} × [B]_{n×p}
Procedure: Fix a row of A and find all elements
in column j of B for j=0, 1, …, p-1.
Alternative 1.
Scan all of B to find all elements in column j.
Alternative 2.
Compute the transpose of B.
(Put all column elements consecutively)
Once we have located the elements of row i of A and column
j of B we just do a merge operation similar to that used in the
polynomial addition of 2.2
2.4 The sparse matrix ADT (13/18)
General case:
d_ij = a_i0·b_0j + a_i1·b_1j + … + a_i(n-1)·b_(n-1)j
Array A is grouped by i, and after the transpose,
array B is also grouped by j.
2.4 The sparse matrix ADT (16/18)
2.4 The sparse matrix ADT (17/18)
Analyzing the algorithm
cols_b * terms_row1 + total_b +
cols_b * terms_row2 + total_b +
… +
cols_b * terms_rowm + total_b
= cols_b * (terms_row1 + terms_row2 + … + terms_rowm) + rows_a * total_b
= cols_b * total_a + rows_a * total_b
*Figure 3.2: System stack before and after function call a1 (p.103)
Each stack frame holds the invoking function's local variables, its return address, and the old frame pointer fp.
(a) Before a1 is invoked, main's frame is on top of the system stack; (b) after a1 is invoked, a1's frame is pushed above it.
3.1 The stack ADT (3/5)
The ADT specification of the stack is shown in
Structure 3.1
3.1 The stack ADT (4/5)
Implementation: using array
3.1 The stack ADT (5/5)
3.2 The queue ADT (1/7)
A queue is an ordered list in which all insertions take
place at one end, called the rear, and all deletions
take place at the opposite end, called the front.
If we insert the elements A, B, C, D, E, in that
order, then A is the first element we delete from the
queue.
A queue is also known as a First-In-First-Out (FIFO)
list.
3.2 The queue ADT (2/7)
The ADT specification of the queue appears in
Structure 3.2
3.2 The queue ADT (3/7)
Implementation 1:
using a one dimensional array and
two variables, front and rear
3.2 The queue ADT (4/7)
Problem: there may still be available space when IsFullQ is true, i.e., shifting (movement) is
required.
3.2 The queue ADT (5/7)
Example 3.2 [Job scheduling]:
Figure 3.5 illustrates how an operating system might
process jobs if it used a sequential representation for
its queue.
As jobs enter and leave the system, the queue gradually shifts
to the right.
In this case, queue_full should move the entire queue to the
left so that the first element is again at queue[0], front is at -1,
and rear is correctly positioned.
Shifting an array is very time-consuming; queue_full has a
worst-case complexity of O(MAX_QUEUE_SIZE).
3.2 The queue ADT (6/7)
We can obtain a more efficient representation
if we regard the array queue[MAX_QUEUE_SIZE] as
circular.
Implementation 2:
regard an array as a circular
queue
front: one position
counterclockwise from the
first element
rear: current end
3.3 A Mazing Problem (1/8)
A maze is represented as a two-dimensional array of 0s (open positions) and 1s (walls).
The entrance is at position [1][1] and the exit at [m][p].
3.3 A Mazing Problem (2/8)
If X marks the spot of our current location,
maze[row][col], then Figure 3.9 shows the
possible moves from this position
3.3 A Mazing Problem (3/8)
A possible implementation:
Predefinition: the possible directions to move in an
array, move, as in Figure 3.10.
Obtained from Figure 3.9
typedef struct {
short int vert;
short int horiz;
} offsets;
offsets move[8]; /*array of moves for each direction*/
If we are at position maze[row][col] and we wish to find
the position of the next move, maze[next_row][next_col], we set:
next_row = row + move[dir].vert;
next_col = col + move[dir].horiz;
3.3 A Mazing Problem (4/8)
Initial attempt at a maze traversal algorithm
maintain a second two-dimensional array, mark, to record the maze positions already checked
#define MAX_STACK_SIZE 100 /* maximum stack size */
typedef struct {
  short int row;
  short int col;
  short int dir;
} element;
element stack[MAX_STACK_SIZE];
The backtracking search pushes triples <row, col, dir> on the stack (R: row, C: col, D: dir), e.g. R1C1D1, R2C2D1, R1C3D2, R1C4D2, R1C5D5, …, and pops entries back off when every direction from a position is blocked (e.g. R2C12D3 is popped out after R3C13D6 and R3C14D5 fail).
maze[1][1] is the entrance; initially mark[1][1] is set to 1. maze[15][11] is the exit.
(The figure shows the 0/1 maze array side by side with a rendering in which walls are drawn as solid blocks.)
3.3 A Mazing Problem (6/8)
Review of add and delete to a stack
3.3 A Mazing
Problem (7/8)
Analysis:
The worst-case computing time of path is O(mp), where m and p are the number of rows and columns of the maze, respectively.
(The figure numbers the eight move directions 0-7 clockwise: 0 = N, 2 = E, 4 = S, 6 = W.)
The maze proper occupies positions (1,1) through (m,p); with the surrounding border of 1s the array has size (m+2)*(p+2).
3.3 A Mazing Problem (8/8)
The size of the stack: a bound is m*p, since a path never visits a maze position twice.
(The figure shows a zig-zag maze whose path snakes through a large fraction of the positions, illustrating that long paths do occur.)
Postfix:
no
parentheses,
no precedence
3.4 Evaluation of Expressions
(5/14)
Evaluating postfix expressions is much simpler
than evaluating infix expressions:
There are no parentheses to consider.
To evaluate an expression we make a single
left-to-right scan of it.
We can evaluate
an expression
easily by using
a stack.
Figure 3.14 shows this
processing when the
input is the nine-character
string 62/3-42*+
3.4 Evaluation of Expressions (6/14)
Representation
We now consider the representation of both the stack and
the expression
3.4 Evaluation of Expressions
(7/14)
Get Token
3.4 (8/14)
Evaluation of
Postfix Expression
3.4 Evaluation of Expressions
(9/14)
string: 62/3-42*+ (the postfix form of 6/2-3+4*2)
Scanning left to right:
6, 2 → operands: put into the stack (stack: 6, 2)
/ → operator: pop two elements, push 6/2 (stack: 6/2)
3 → operand: put into the stack (stack: 6/2, 3)
- → operator: pop two elements, push 6/2-3 (stack: 6/2-3)
4, 2 → operands: put into the stack (stack: 6/2-3, 4, 2)
* → operator: pop two elements, push 4*2 (stack: 6/2-3, 4*2)
+ → operator: pop two elements, push 6/2-3+4*2
End of string → pop the stack and get the answer: 6/2-3+4*2 = 8
3.4 Evaluation of Expressions
(10/14)
We can describe an algorithm for producing a postfix
expression from an infix one as follows:
(1) Fully parenthesize expression
a/b-c+d*e-a*c
((((a / b) - c) + (d * e)) - (a * c))
(2) All operators replace their corresponding right
parentheses
((((a / b) - c) + (d * e)) - (a * c))
(each operator /, -, +, *, - moves onto its matching right parenthesis; two passes)
(3) Delete all parentheses, yielding:
ab/c-de*+ac*-
The order of operands is the same in infix and postfix
3.4 Evaluation of Expressions
(11/14)
Example 3.3 [Simple expression]: the simple expression a+b*c
yields abc*+ in postfix.
a is output; + (icp 12) is pushed; b is output;
* has icp 13 > isp 12 of the stacked +, so * is pushed; c is output;
at the end of the string the stack is popped, emitting * (isp 13) and then + (isp 12).
3.4 Evaluation of Expressions
(12/14)
Algorithm to convert from infix to postfix
Assumptions:
operators: (, ), +, -, *, /, %
operands: single digit integer or variable of one character
1. Operands are taken out immediately
2. Operators are taken out of the stack as long as their in-stack
precedence (isp) is higher than or equal to the incoming
precedence (icp) of the new operator
3. ‘(‘ has low isp, and high icp
op    (   )   +   -   *   /   %  eos
isp   0  19  12  12  13  13  13   0
icp  20  19  12  12  13  13  13   0
precedence stack[MAX_STACK_SIZE];
/* isp and icp arrays -- index is value of precedence lparen,
rparen, plus, minus, times, divide, mod, eos */
static int isp[] = {0, 19, 12, 12, 13, 13, 13, 0};
static int icp[] = {20, 19, 12, 12, 13, 13, 13, 0};
3.4 Evaluation of Expressions
(13/14)
a * ( b + c ) / d
a → operand: print it out
* → operator: push onto the stack
( → the icp of "(" is 20, higher than the isp 13 of "*", so push "("
b → operand: print it out
+ → the isp of "(" is 0, lower than the icp 12 of "+", so push "+"
c → operand: print it out
) → pop and print until "(" is popped: print "+"
/ → the isp of "*" is 13 >= the icp 13 of "/": pop and print "*", then push "/"
d → operand: print it out
eos → pop and print "/"
output: a b c + * d /
3.4 Evaluation of Expressions
(14/14)
Complexity: Θ(n)
The total time spent
here is Θ(n), as the
number of tokens that
get stacked and
unstacked is linear in n,
where n is the number
of tokens in the
expression.
3.5 MULTIPLE STACKS AND
QUEUE (1/5)
Two stacks share one array m[0], m[1], …, m[n-2], m[n-1]:
m[0] is the bottommost element of stack 1 and m[n-1] the
bottommost element of stack 2; the two stacks grow toward
each other.
More than two stacks (n stacks)
memory is divided into n equal segments
boundary[stack_no], 0 <= stack_no < MAX_STACKS
top[stack_no], 0 <= stack_no < MAX_STACKS
3.5 MULTIPLE STACKS AND
QUEUE (2/5)
Initially, boundary[i]=top[i].
All stacks are empty and divided into roughly equal segments.
3.5 MULTIPLE STACKS AND
QUEUE (3/5)
*(p.128)
#define MEMORY_SIZE 100 /* size of memory */
#define MAX_STACKS 10 /* max number of stacks plus 1 */
/* global memory declaration */
element memory[MEMORY_SIZE];
int top[MAX_STACKS];
int boundary[MAX_STACKS];
int n; /* number of stacks entered by the user */
*(p.129) To divide the array into roughly equal segments:
top[0] = boundary[0] = -1;
for (i = 1; i < n; i++)
  top[i] = boundary[i] = (MEMORY_SIZE / n) * i;
boundary[n] = MEMORY_SIZE - 1;
3.5 MULTIPLE STACKS AND
QUEUE (4/5)
*Program 3.12: Add an item to the ith stack (p.129)
void add(int i, element item) {
  /* add an item to the ith stack */
  if (top[i] == boundary[i+1])
    stack_full(i); /* there may still be unused storage elsewhere */
  memory[++top[i]] = item;
}
*Program 3.13: Delete an item from the ith stack (p.130)
element delete(int i) {
  /* remove top element from the ith stack */
  if (top[i] == boundary[i])
    return stack_empty(i);
  return memory[top[i]--];
}
3.5 MULTIPLE STACKS AND
QUEUE (5/5)
Find j, stack_no < j < n (searching to the right),
such that top[j] < boundary[j+1],
or 0 <= j < stack_no (searching to the left);
then shift stacks to free a slot for stack i.
(b = boundary, t = top)
*Figure 3.19: Configuration when stack i meets stack i+1, but the memory is not full (p.130)
Chapter 4 Lists
Pointers
Singly Linked Lists
Dynamically Linked Stacks and Queues
Polynomials
Chain
Circularly Linked Lists
Equivalence Relations
Doubly Linked Lists
Pointers (1/5)
Consider the following alphabetized list of three-letter
English words ending in at:
(bat, cat, sat, vat)
If we store this list in an array
Add the word mat to this list
move sat and vat one position to the right before we insert mat.
Remove the word cat from the list
move sat and vat one position to the left
Problems of a sequence representation (ordered
list)
Arbitrary insertion and deletion from arrays can be very
time-consuming
Pointers (2/5)
An elegant solution: using linked representation
Items may be placed anywhere in memory.
In a sequential representation the order of elements is
the same as in the ordered list, while in a linked
representation these two sequences need not be the
same.
With each element, store the address, or location, of the
next element in the list, so that the elements can be
accessed in the correct order.
Thus, associated with each list element is a node
which contains both a data component and a pointer
to the next item in the list. The pointers are often
called links.
Pointers (3/5)
C provides extensive support for pointers.
The two most important operators used with the pointer
type:
& the address-of operator
* the dereferencing (or indirection) operator
Example:
If we have the declaration:
int i, *pi;
then i is an integer variable and pi is a pointer to an integer.
If we say:
pi = &i;
then &i returns the address of i and assigns it as the value of pi.
To assign a value to i we can say:
i = 10; or *pi = 10;
Pointers (4/5)
Pointers can be dangerous
Using pointers: high degree of flexibility and efficiency, but
dangerous as well.
It is a wise practice to set all pointers to NULL when they are not
actually pointing to an object.
Another wise practice: use an explicit type cast when converting between
pointer types.
Example:
pi = malloc(sizeof(int)); /* assign to pi a pointer to int */
pf = (float *)pi;         /* casts an int pointer to a float pointer */
In many systems, pointers have the same size as type int.
Since int is the default type specifier, some programmers omit the
return type when defining a function.
The return type defaults to int which can later be interpreted as a
pointer.
Pointers (5/5)
Using dynamically allocated storage
When programming, you may not know how much space
you will need, nor do you wish to allocate some very large
area that may never be required.
C provides heap, for allocating storage at run-time.
You may call a function, malloc, and request the amount of
memory you need.
When you no longer need an area of memory, you may free it by
calling another function, free, and return the area of memory to the
system.
Example:
request memory
return memory
Singly Linked Lists (1/15)
Linked lists are drawn as an ordered sequence of
nodes with links represented as arrows (Figure
4.1).
The name of the pointer to the first node in the list is the
name of the list. (The list of Figure 4.1 is called ptr.)
Notice that we do not explicitly put in the values of pointers, but
simply draw arrows to indicate that they are there.
Singly Linked Lists (2/15)
The nodes do not reside in sequential locations.
The locations of the nodes may change on
different runs.
(Figure: ptr → node → … → NULL; each node has a data field and a link field. Such a list is called a chain.)
Singly Linked Lists (3/15)
Why it is easier to make arbitrary insertions and
deletions using a linked list?
To insert the word mat between cat and sat, we must:
Get a node that is currently unused; let its address be paddr.
Set the data field of this node to mat.
Set paddr’s link field to point to the address found in the link
field of the node containing cat.
Set the link field of the node containing cat to point to paddr.
Singly Linked Lists (4/15)
Delete mat from the list:
We only need to find the element that immediately
precedes mat, which is cat, and set its link field to point
to mat’s link (Figure 4.3).
We have not moved any data, and although the link
field of mat still points to sat, mat is no longer in the
list.
Singly Linked Lists (5/15)
We need the following capabilities to make linked
representations possible:
Defining a node’s structure, that is, the fields it
contains. We use self-referential structures, discussed
in Section 2.2 to do this.
Create new nodes when we need them. (malloc)
Remove nodes that we no longer need. (free)
Singly Linked Lists (6/15)
2.2.4 Self-Referential Structures (review)
One or more of its components is a pointer to itself.
(Figure: a node holding "bat\0" with link NULL, named ptr; and a list ptr: 10 → 20 → NULL with a separate node temp holding 50.)
Singly Linked Lists (11/15)
Implement Insertion:
void insert(list_pointer *ptr, list_pointer node)
{
  /* insert a new node with data = 50 into the list ptr after node */
  list_pointer temp;
  temp = (list_pointer)malloc(sizeof(list_node));
  if (IS_FULL(temp)) {
    fprintf(stderr, "The memory is full\n");
    exit(1);
  }
  temp->data = 50;
Singly Linked Lists (12/15)
  if (*ptr) {          /* nonempty list */
    temp->link = node->link;
    node->link = temp;
  }
  else {               /* empty list */
    temp->link = NULL;
    *ptr = temp;
  }
}
(Figure: ptr: 10 → 20 → NULL, with the new node temp (50) linked in after node.)
Singly Linked Lists (13/15)
Deletion
Observation: deleting the node 50 from the list
ptr: 10 → 50 → 20 → NULL
yields
ptr: 10 → 20 → NULL
Singly Linked Lists (14/15)
Implement Deletion:
void delete(list_pointer *ptr, list_pointer trail, list_pointer node)
{
  /* delete node from the list; trail is the preceding node,
     ptr is the head of the list */
  if (trail)
    trail->link = node->link;
  else
    *ptr = (*ptr)->link;
  free(node);
}
Singly Linked Lists (15/15)
Print out a list (traverse a list)
Program 4.5: Printing a list
void print_list(list_pointer ptr)
{
  printf("The list contains: ");
  for (; ptr; ptr = ptr->link)
    printf("%4d", ptr->data);
  printf("\n");
}
Dynamically Linked
Stacks and Queues (1/8)
When several stacks and queues coexist, there
is no efficient way to represent them
sequentially.
Notice that the direction of links for both the stack and the queue
facilitates easy insertion and deletion of nodes.
Easily add or delete a node at the top of the stack.
Easily add a node to the rear of the queue and add or delete a
node at the front of a queue.
Dynamically Linked
Stacks and Queues (2/8)
Represent n stacks: each stack is a chain of nodes,
top → item|link → … → NULL.
Dynamically Linked
Stacks and Queues (3/8)
Push onto the linked stack
void add(stack_pointer *top, element item)
{
  /* add an element to the top of the stack */
  stack_pointer temp = (stack_pointer)malloc(sizeof(stack));
  if (IS_FULL(temp)) {
    fprintf(stderr, "The memory is full\n");
    exit(1);
  }
  temp->item = item;
  temp->link = *top;
  *top = temp;
}
Dynamically Linked
Stacks and Queues (4/8)
Pop from the linked stack
element delete(stack_pointer *top)
{
  /* delete an element from the stack */
  stack_pointer temp = *top;
  element item;
  if (IS_EMPTY(temp)) {
    fprintf(stderr, "The stack is empty\n");
    exit(1);
  }
  item = temp->item;
  *top = temp->link;
  free(temp);
  return item;
}
Dynamically Linked
Stacks and Queues (5/8)
Represent n queues: each queue is a chain
front → item|link → … → rear → NULL,
with deletions at the front and additions at the rear.
Dynamically Linked
Stacks and Queues (6/8)
enqueue in the linked queue: allocate a new node temp,
set its link to NULL, attach it after rear, and move rear to temp.
(Figure: the queue front → … → rear → NULL before and after temp is linked in.)
Dynamically Linked
Stacks and Queues (8/8)
The solution presented above to the n-stack, m-
queue problem is both computationally and
conceptually simple.
We no longer need to shift stacks or queues to make
space.
Computation can proceed as long as there is memory
available.
Polynomials (1/9)
Representing Polynomials As Singly Linked Lists
The manipulation of symbolic polynomials is a classic example
of list processing.
In general, we want to represent the polynomial:
A(x) = a_(m-1)·x^e_(m-1) + … + a_0·x^e_0
where the a_i are nonzero coefficients and the e_i are
nonnegative integer exponents such that
e_(m-1) > e_(m-2) > … > e_1 > e_0 >= 0.
We will represent each term as a node containing coefficient and
exponent fields, as well as a pointer to the next term.
Polynomials (2/9)
Assuming that the coefficients are integers, the type
declarations are:
typedef struct poly_node *poly_pointer;
typedef struct poly_node {
  int coef;
  int expon;
  poly_pointer link;
} poly_node;
poly_pointer a, b, d;
/* e.g. a = 3x^14 + 2x^8 + 1, b = 8x^14 - 3x^10 + 10x^6 */
Draw poly_nodes as: coef | expon | link
Polynomials (3/9)
Adding Polynomials
To add two polynomials, we examine their terms
starting at the nodes pointed to by a and b.
If the exponents of the two terms are equal
1. add the two coefficients
2. create a new term for the result.
If the exponent of the current term in a is less than b
1. create a duplicate term of b
2. attach this term to the result, called d
3. advance the pointer to the next term in b.
We take a similar action on a if a->expon > b->expon.
Figure 4.12 shows the generation of the first three terms of
d = a+b (next page)
Polynomials
(4/9)
Polynomials
(5/9)
Add two
polynomials
Polynomials (6/9)
Attach a node to the end of a list
void attach(float coefficient, int exponent, poly_pointer *ptr)
{
  /* create a new node with coef = coefficient and expon = exponent,
     attach it to the node pointed to by ptr; ptr is updated to point
     to this new node */
  poly_pointer temp;
  temp = (poly_pointer)malloc(sizeof(poly_node));  /* create new node */
  if (IS_FULL(temp)) {
    fprintf(stderr, "The memory is full\n");
    exit(1);
  }
  temp->coef = coefficient;   /* copy item to the new node */
  temp->expon = exponent;
  (*ptr)->link = temp;        /* attach */
  *ptr = temp;                /* move ptr to the end of the list */
}
Polynomials (7/9)
Analysis of padd
A(x) = a_(m-1)·x^e_(m-1) + … + a_0·x^e_0,  B(x) = b_(n-1)·x^f_(n-1) + … + b_0·x^f_0
1. coefficient additions
0 <= number of coefficient additions <= min(m, n)
where m (n) denotes the number of terms in A (B).
2. exponent comparisons
extreme case:
e_(m-1) > f_(m-1) > e_(m-2) > f_(m-2) > … > e_1 > f_1 > e_0 > f_0
m + n - 1 comparisons
3. creation of new nodes
extreme case: maximum number of terms in d is m+n
m + n new nodes
summary: O(m+n)
Polynomials (8/9)
A Suite for Polynomials
e(x) = a(x) * b(x) + d(x)
Functions used: read_poly(), print_poly(), padd(), psub(), pmult()
poly_pointer a, b, d, e;
...
a = read_poly();
b = read_poly();
d = read_poly();
temp = pmult(a, b);  /* temp holds a partial result */
e = padd(temp, d);   /* by returning the nodes of temp, we may
                        reuse them to hold other polynomials */
print_poly(e);
Polynomials (9/9)
Erase Polynomials
erase frees the nodes in temp
void erase(poly_pointer *ptr)
{
  /* erase the polynomial pointed to by ptr */
  poly_pointer temp;
  while (*ptr) {
    temp = *ptr;
    *ptr = (*ptr)->link;
    free(temp);
  }
}
Chain (1/3)
Chain:
A singly linked list in which the last node has a null link
Operations for chains
Inverting a chain
For a list of length ≧1 nodes, the while loop is executed
length times and so the computing time is linear or O(length).
(Figure: the chain lead → … → NULL before inversion, and the same nodes with every link reversed after invert.)
Chain
(2/3)
(Figure: two chains ptr1 → … → NULL and ptr2 → … → NULL, with a pointer temp used while processing them.)
Circularly Linked Lists (1/10)
Circular Linked list
The link field of the last node points to the first
node in the list.
Example
Represent a polynomial ptr = 3x^14 + 2x^8 + 1 as a
circularly linked list:
ptr → |3|14| → |2|8| → |1|0| → (back to the first node)
Circularly Linked Lists (2/10)
Maintain an Available List
We free nodes that are no longer in use so that we
may reuse these nodes later
We can obtain an efficient erase algorithm for circular
lists, by maintaining our own list (as a chain) of nodes
that have been “freed”.
Instead of using malloc and free, we now use
get_node (program 4.13) and ret_node (program
4.14).
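A minimal sketch of this scheme; the poly_node type follows the earlier polynomial slides, and the global avail chain holds the freed nodes:

```c
#include <assert.h>
#include <stdlib.h>

typedef struct poly_node *poly_pointer;
typedef struct poly_node {
    int coef;
    int expon;
    poly_pointer link;
} poly_node;

poly_pointer avail = NULL;  /* chain of freed, reusable nodes */

/* get_node: reuse a node from the avail chain if possible,
   otherwise allocate a fresh one */
poly_pointer get_node(void)
{
    poly_pointer node;
    if (avail) {
        node = avail;
        avail = avail->link;
    } else {
        node = malloc(sizeof(*node));
    }
    return node;
}

/* ret_node: push a no-longer-needed node onto the avail chain */
void ret_node(poly_pointer ptr)
{
    ptr->link = avail;
    avail = ptr;
}
```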
Circularly Linked Lists (7/10)
To fit the circular list with head node representation:
we may remove the test for (*ptr) from cerase
we change the original padd to cpadd
/* the head node has expon = -1, so b->expon > -1 always holds */
Circularly Linked Lists (8/10)
Question: how do we reach the last node of the list ptr -> x1 -> x2 -> x3?
Answer: move down the entire length of ptr.
Possible solution: let ptr point to the last node instead of the first.
(figure: x1 -> x2 -> x3, with ptr pointing to the last node x3)
Circularly Linked Lists (9/10)
Insert a new node at the front of a circular list
To insert node at the rear, we only need to add the additional statement *ptr = node to the else clause of insert_front
(figure: node inserted in front of x1 in the list x1 -> x2 -> x3, where ptr points to the last node x3)
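A sketch of the front insertion just described; the type names are assumptions for this example, and the comment marks the one-statement change that turns it into rear insertion:

```c
#include <assert.h>
#include <stdlib.h>

typedef struct list_node *list_pointer;
typedef struct list_node {
    int data;
    list_pointer link;
} list_node;

/* insert node at the front of the circular list whose LAST
   node is pointed to by *ptr */
void insert_front(list_pointer *ptr, list_pointer node)
{
    if (!*ptr) {
        *ptr = node;
        node->link = node;          /* list was empty */
    } else {
        node->link = (*ptr)->link;  /* link to the old first node */
        (*ptr)->link = node;
        /* to insert at the rear instead, also do: *ptr = node; */
    }
}
```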
Circularly Linked Lists (10/10)
Finding the length of a circular list
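A sketch of the length operation: cycle through the list until we return to the starting node. The node type and the build_circular helper are assumptions for this example:

```c
#include <assert.h>
#include <stdlib.h>

typedef struct list_node *list_pointer;
typedef struct list_node {
    int data;
    list_pointer link;
} list_node;

/* count the nodes of a circular list; O(n) for n nodes */
int length(list_pointer ptr)
{
    list_pointer temp;
    int count = 0;
    if (ptr) {
        temp = ptr;
        do {
            count++;
            temp = temp->link;   /* advance around the cycle */
        } while (temp != ptr);
    }
    return count;
}

/* hypothetical helper: build a circular list of n nodes */
list_pointer build_circular(int n)
{
    list_pointer first = NULL, last = NULL;
    for (int i = 0; i < n; i++) {
        list_pointer node = malloc(sizeof(*node));
        node->data = i;
        node->link = NULL;
        if (!first) first = node; else last->link = node;
        last = node;
    }
    if (last) last->link = first;  /* close the cycle */
    return first;
}
```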
Equivalence Relations (1/6)
Reflexive Relation
for any polygon x, x ≡ x (e.g., x is electrically equivalent to itself)
Symmetric Relation
for any two polygons x and y, if x ≡ y, then y ≡ x.
Transitive Relation
for any three polygons x, y, and z, if x ≡ y and y ≡ z, then x ≡ z.
Definition:
A relation over a set, S, is said to be an equivalence relation over S iff it is reflexive, symmetric, and transitive over S.
Example:
“equal to” relationship is an equivalence relation
Example: Equivalence Relations (2/6)
if we have 12 polygons numbered 0 through 11
0≡4, 3≡1, 6≡10, 8≡9, 7≡4, 6≡8, 3≡5, 2≡11, 11≡0
we can partition the twelve polygons into the following
equivalence classes:
{0, 2, 4, 7, 11};{1, 3, 5};{6, 8, 9,10}
Two phases to determine equivalence
First phase: the equivalence pairs (i, j) are read in and
stored.
Second phase:
we begin at 0 and find all pairs of the form (0, j).
Continue until the entire equivalence class containing 0 has
been found, marked, and printed.
Next find another object not yet output, and repeat
the above process.
Equivalence Relations (3/6)
Program to find equivalence classes
#include <stdio.h>
#define MAX_SIZE 24
#define IS_FULL(ptr) (!(ptr))
#define FALSE 0
#define TRUE 1
typedef struct node *node_pointer;
typedef struct node {
  int data;
  node_pointer link;
};
void main(void){
  short int out[MAX_SIZE];
  node_pointer seq[MAX_SIZE];
  node_pointer x, y, top;
  int i, j, n;
  printf("Enter the size (<=%d) ", MAX_SIZE);
  scanf("%d", &n);
  for (i = 0; i < n; i++) { /* initialize seq and out */
    out[i] = TRUE; seq[i] = NULL;
  }
  /* Phase 1 */
  /* Phase 2 */
}
Equivalence Relations (4/6)
Phase 1: read in and store the equivalence pairs <i, j>; for each pair, (1) insert a node for j at the top of list seq[i] and (2) insert a node for i at the top of list seq[j]
pairs: 0≡4, 3≡1, 6≡10, 8≡9, 7≡4, 6≡8, 3≡5, 2≡11, 11≡0
seq[0]: 11 -> 4 -> NULL
seq[1]: 3 -> NULL
seq[2]: 11 -> NULL
seq[3]: 5 -> 1 -> NULL
seq[4]: 7 -> 0 -> NULL
seq[5]: 3 -> NULL
seq[6]: 8 -> 10 -> NULL
seq[7]: 4 -> NULL
seq[8]: 6 -> 9 -> NULL
seq[9]: 8 -> NULL
seq[10]: 6 -> NULL
seq[11]: 0 -> 2 -> NULL
Equivalence Relations (5/6)
Phase 2:
begin at 0 and find all pairs of the form <0, j>, where 0
and j are in the same equivalence class
by transitivity, each pair of the form <j, k> implies that k is
in the same equivalence class as 0
continue this way until we have found, marked, and
printed the entire equivalent class containing 0
Equivalence Relations (6/6)
(figure: Phase 2 trace on the lists above; starting at i = 0, a stack top of list nodes drives the traversal through j = 11, 7, 2, 4, marking each visited index in out[])
New class: 0 11 4 7 2
Doubly Linked Lists (1/4)
Singly linked lists pose problems because we
can move easily only in the direction of the links
(figure: in the chain ... -> NULL we cannot back up from a node ptr to its predecessor)
Doubly linked list has at least three fields
left link field(llink), data field(item), right link field(rlink).
The necessary declarations:
typedef struct node *node_pointer;
typedef struct node{
node_pointer llink;
element item;
node_pointer rlink;
};
Doubly Linked Lists (2/4)
Sample: doubly linked circular list with head node (Figure 4.23)
(figure: head node with a new node inserted to the right of an existing node)
Doubly Linked Lists (4/4)
Delete node
(figure: deleting a node from a doubly linked circular list with head node)
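The insertion and deletion just illustrated can be sketched in C; the field names follow the declarations above, and building the one-node demo list in the test is an assumption of this example:

```c
#include <assert.h>
#include <stdlib.h>

typedef struct dnode *dnode_pointer;
typedef struct dnode {
    dnode_pointer llink;
    int item;
    dnode_pointer rlink;
} dnode;

/* insert newnode to the right of node: four pointer assignments */
void dinsert(dnode_pointer node, dnode_pointer newnode)
{
    newnode->llink = node;
    newnode->rlink = node->rlink;
    node->rlink->llink = newnode;
    node->rlink = newnode;
}

/* unlink and free `deleted`; the head node itself must stay */
void ddelete(dnode_pointer head, dnode_pointer deleted)
{
    if (head == deleted) return;   /* never delete the head node */
    deleted->llink->rlink = deleted->rlink;
    deleted->rlink->llink = deleted->llink;
    free(deleted);
}
```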
Binary Tree Traversals (5/9)
Postorder traversal (LRV) (recursive version)
output: A B / C * D * E +
Binary Tree Traversals (6/9)
Iterative inorder traversal
we use a stack to simulate recursion
(figure: the expression tree for A/B*C*D+E with an inorder node numbering; a stack of node pointers simulates the recursion)
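A sketch of the stack-based traversal: push the left spine, pop, visit, then move right. Instead of printing, this version appends visited data to a buffer so the result can be checked; stack-overflow checks are omitted for brevity, and the mknode helper is an assumption of this example:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_STACK_SIZE 100

typedef struct tnode *tree_pointer;
typedef struct tnode {
    char data;
    tree_pointer left_child, right_child;
} tnode;

/* hypothetical helper to build nodes for the demo */
tree_pointer mknode(char d, tree_pointer l, tree_pointer r)
{
    tree_pointer t = malloc(sizeof(*t));
    t->data = d; t->left_child = l; t->right_child = r;
    return t;
}

/* iterative inorder traversal using an explicit stack */
void iter_inorder(tree_pointer node, char *out)
{
    tree_pointer stack[MAX_STACK_SIZE];
    int top = -1, k = 0;
    for (;;) {
        for (; node; node = node->left_child)
            stack[++top] = node;     /* push the left spine */
        if (top == -1) break;        /* stack empty: done */
        node = stack[top--];         /* pop */
        out[k++] = node->data;       /* visit */
        node = node->right_child;    /* continue in right subtree */
    }
    out[k] = '\0';
}
```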
Additional Binary Tree Operations (1/7)
Copying Binary Trees
we can modify the postorder traversal algorithm only
slightly to copy the binary tree
similar to Program 5.3
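A sketch of the postorder-style copy: copy both subtrees first (L, R), then fill in the node's data (V). The tnode type and mknode helper are assumptions for this example:

```c
#include <assert.h>
#include <stdlib.h>

typedef struct tnode *tree_pointer;
typedef struct tnode {
    char data;
    tree_pointer left_child, right_child;
} tnode;

/* hypothetical helper to build nodes for the demo */
tree_pointer mknode(char d, tree_pointer l, tree_pointer r)
{
    tree_pointer t = malloc(sizeof(*t));
    t->data = d; t->left_child = l; t->right_child = r;
    return t;
}

/* copy: duplicate the tree bottom-up (postorder) */
tree_pointer copy(tree_pointer original)
{
    tree_pointer temp = NULL;
    if (original) {
        temp = malloc(sizeof(*temp));
        temp->left_child = copy(original->left_child);   /* L */
        temp->right_child = copy(original->right_child); /* R */
        temp->data = original->data;                     /* V */
    }
    return temp;
}
```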
Additional Binary Tree Operations (2/7)
Testing Equality
Binary trees are equivalent if they have the same
topology and the information in corresponding nodes
is identical (V L R: compare the data in the two roots, then recurse on the left and right subtrees)
(figure: two equivalent binary trees with nodes X1, X2, X3)
Additional Binary Tree Operations (6/7)
node structure
For the purpose of our evaluation algorithm, we assume each node has four fields: left_child, data, value, right_child
(figure: a propositional formula tree with the TRUE/FALSE value computed at each node)
Threaded Binary Trees (1/10)
Threads
Do you find any drawback of the above tree?
Too many null pointers in current representation of
binary trees
n: number of nodes
number of non-null links: n-1
total links: 2n
null links: 2n-(n-1) = n+1
Solution: replace these null pointers with some useful
“threads”
Threaded Binary Trees (2/10)
Rules for constructing the threads
If ptr->left_child is null,
replace it with a pointer to the node that would be
visited before ptr in an inorder traversal
If ptr->right_child is null,
replace it with a pointer to the node that would be
visited after ptr in an inorder traversal
Threaded Binary Trees (3/10)
A Threaded Binary Tree
(figure: tree with root A, internal nodes B-G, leaves H and I; t marks a true thread, f a false flag (real child); the two dangling threads at the inorder-first and inorder-last nodes are handled by a head node)
inorder traversal: H D I B E A F C G
Threaded Binary Trees (4/10): Inorder
Threaded Binary Trees (8/10)
Inorder traversal of a threaded binary tree
void tinorder(threaded_pointer tree){
  /* traverse the threaded binary tree inorder */
  threaded_pointer temp = tree;
  for (;;) {
    temp = insucc(temp);
    if (temp == tree)
      break;
    printf("%3c", temp->data);
  }
}
output: H D I B E A F C G
Time Complexity: O(n)
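The traversal above relies on an inorder-successor function. A sketch, with an assumed threaded_node layout (a thread flag stored beside each child pointer; the head node's right child points to the head itself):

```c
#include <assert.h>
#include <stddef.h>

typedef struct threaded_node *threaded_pointer;
typedef struct threaded_node {
    short int left_thread;       /* 1 if left_child is a thread */
    threaded_pointer left_child;
    char data;
    threaded_pointer right_child;
    short int right_thread;      /* 1 if right_child is a thread */
} threaded_node;

/* inorder successor: follow the right link; if it is a real
   child, descend to the leftmost node of that subtree */
threaded_pointer insucc(threaded_pointer tree)
{
    threaded_pointer temp = tree->right_child;
    if (!tree->right_thread)
        while (!temp->left_thread)
            temp = temp->left_child;
    return temp;
}
```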
Threaded Binary Trees (9/10)
Inserting A Node Into A Threaded Binary Tree
Insert child as the right child of node parent
1. change parent->right_thread to FALSE
2. set child->left_thread and child->right_thread to
TRUE
3. set child->left_child to point to parent
4. set child->right_child to parent->right_child
5. change parent->right_child to point to child
Threaded Binary Trees (10/10)
Right insertion in a threaded binary tree
void insert_right(threaded_pointer parent, threaded_pointer child){
  /* insert child as the right child of parent in a threaded binary tree */
  threaded_pointer temp;
  child->right_child = parent->right_child;
  child->right_thread = parent->right_thread;
  child->left_child = parent;
  child->left_thread = TRUE;
  parent->right_child = child;
  parent->right_thread = FALSE;
  if (!child->right_thread){
    temp = insucc(child);
    temp->left_child = child;
  }
}
(figures: first case - parent's right subtree was empty; second case - child acquires parent's old right subtree, and the left thread of child's inorder successor is updated to point to child)
Heaps (1/6)
The heap abstract data type
Definition: A max(min) tree is a tree in which the key
value in each node is no smaller (larger) than the key
values in its children. A max (min) heap is a complete
binary tree that is also a max (min) tree
Basic Operations:
creation of an empty heap
insertion of a new element into a heap
deletion of the largest element from the heap
Heaps (2/6)
The examples of max heaps and min heaps
Property: The root of max heap (min heap) contains
the largest (smallest) element
Heaps (3/6)
Abstract data type of Max Heap
Heaps (4/6)
Queue in Chapter 3: FIFO
Priority queues
Heaps are frequently used to implement priority queues
delete the element with highest (lowest) priority
insert the element with arbitrary priority
Heaps are not the only way to implement priority queues, but they are among the most efficient
Examples:
machine service: waiting time (min heap) or amount of payment (max heap)
factory: jobs ordered by time tag
Heaps (5/6)
Insertion Into A Max Heap
Analysis of insert_max_heap
The complexity of the insertion function is O(log2 n)
(figure: inserting items 5, 2, and 21 into a max heap; each inserted item starts at the new leaf and bubbles up past every parent with a smaller key)
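A sketch of the insertion, with the elements kept in a global array heap[1..n] (names and the silent return on a full heap are assumptions of this example):

```c
#include <assert.h>

#define MAX_ELEMENTS 200
#define HEAP_FULL(n) ((n) == MAX_ELEMENTS - 1)

typedef struct {
    int key;
} element;

element heap[MAX_ELEMENTS];
int n = 0;  /* heap[1..n] is a max heap */

/* insert item: start at the new leaf n+1 and move the item up
   past every parent whose key is smaller */
void insert_max_heap(element item)
{
    int i;
    if (HEAP_FULL(n)) return;  /* full heap: nothing inserted */
    i = ++n;
    while (i != 1 && item.key > heap[i / 2].key) {
        heap[i] = heap[i / 2]; /* move the smaller parent down */
        i /= 2;
    }
    heap[i] = item;
}
```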
Heaps (6/6)
Deletion from a max heap
After deletion, the heap is still a complete binary tree
Analysis of delete_max_heap
The complexity of the deletion function is O(log2 n)
(figure: deleting the max element (item.key = 20) from the heap [20, 15, 2, 14, 10]; the last element (temp.key = 10) is reinserted from the root and sinks past larger children)
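A sketch of the deletion; it assumes the same global heap[1..n] layout as the insertion sketch and a nonempty heap:

```c
#include <assert.h>

#define MAX_ELEMENTS 200

typedef struct {
    int key;
} element;

element heap[MAX_ELEMENTS];
int n = 0;  /* heap[1..n] is a max heap; assumed nonempty here */

/* delete and return the max element; the last element is
   removed and trickled down from the root past larger children */
element delete_max_heap(void)
{
    int parent = 1, child = 2;
    element item = heap[1];   /* max element, to be returned */
    element temp = heap[n--]; /* last element, to be reinserted */
    while (child <= n) {
        if (child < n && heap[child].key < heap[child + 1].key)
            child++;                   /* larger of the two children */
        if (temp.key >= heap[child].key) break;
        heap[parent] = heap[child];    /* move the larger child up */
        parent = child;
        child *= 2;
    }
    heap[parent] = temp;
    return item;
}
```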
Binary Search Trees (1/8)
Why do we need binary search trees?
Heap is not suited for applications in which arbitrary
elements are to be deleted from the element list
a min (max) element is deleted O(log2n)
deletion of an arbitrary element O(n)
search for an arbitrary element O(n)
Definition of binary search tree:
Every element has a unique key
The keys in a nonempty left subtree (right subtree) are
smaller (larger) than the key in the root of subtree
The left and right subtrees are also binary search trees
Binary Search Trees (2/8)
Example: (b) and (c) are binary search trees
(figure: keys in the left subtree are smaller, and keys in the right subtree larger, than the key in the root)
Binary Search Trees (3/8)
Search: Search(25), Search(76)
(figure: a BST with root 44, second level 17 and 88, then 32, 65, 97, then 28, 54, 82, then 29, 76, then 80; Search(25) fails, Search(76) succeeds)
Binary Search Trees (4/8)
Searching a binary search tree takes O(h) time
Binary Search Trees (5/8)
Inserting into a binary search tree
An empty tree
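The search and insertion just described can be sketched as follows. This uses a simple recursive insertion (rather than the book's iterative version with a modified search), and the bst_node type is an assumption of this example:

```c
#include <assert.h>
#include <stdlib.h>

typedef struct bst_node *bst_pointer;
typedef struct bst_node {
    int key;
    bst_pointer left_child, right_child;
} bst_node;

/* recursive search: O(h), where h is the height of the tree */
bst_pointer search(bst_pointer root, int key)
{
    if (!root) return NULL;
    if (key == root->key) return root;
    if (key < root->key) return search(root->left_child, key);
    return search(root->right_child, key);
}

/* recursive insertion: descend as in search and attach a new
   leaf where the search fails; duplicate keys are ignored */
void insert_node(bst_pointer *node, int key)
{
    if (!*node) {
        bst_pointer temp = malloc(sizeof(*temp));
        temp->key = key;
        temp->left_child = temp->right_child = NULL;
        *node = temp;
    } else if (key < (*node)->key) {
        insert_node(&(*node)->left_child, key);
    } else if (key > (*node)->key) {
        insert_node(&(*node)->right_child, key);
    }
}
```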
Binary Search Trees (6/8)
Deletion from a binary search tree
Three cases should be considered
case 1. leaf: simply delete it
case 2. one child: delete the node and change the pointer from its parent to point to this child
case 3. two children: replace the node with either the smallest element in the right subtree or the largest element in the left subtree, then delete that replacing node
Binary Search Trees (7/8)
Height of a binary search tree
The height of a binary search tree with n elements
can become as large as n.
It can be shown that when insertions and deletions
are made at random, the height of the binary search
tree is O(log2n) on the average.
Search trees with a worst-case height of O(log2n) are called balanced search trees
Binary Search Trees (8/8)
Time Complexity
Searching, insertion, removal
O(h), where h is the height of the tree
Worst case - skewed binary tree
O(n), where n is the # of internal nodes
Prevent worst case
rebalancing scheme
AVL, 2-3, and Red-black tree
Selection Trees (1/6)
Problem:
suppose we have k order sequences, called runs, that
are to be merged into a single ordered sequence
Solution:
straightforward: k-1 comparisons per record output
selection tree: about log2(k) comparisons per record (the tree has ceil(log2 k) + 1 levels)
There are two kinds of selection trees:
winner trees and loser trees
Selection Trees (2/6)
Definition: (Winner tree)
a selection tree is the binary tree where each node
represents the smaller of its two children
root node is the smallest node in the tree
a winner is the record with smaller key
Rules:
tournament: played between sibling nodes
put the winner (or, in a loser tree, the loser) in the parent node
Selection Trees (3/6)
Winner Tree
sequential allocation scheme (complete binary tree)
Each node represents the smaller of its two children
(figure: winner tree merging k ordered sequences)
Selection Trees (4/6)
Analysis of merging runs using winner trees
# of levels: ceil(log2 k) + 1; restructure time: O(log2 k)
setup time: O(k)
merge time: O(n log2 k)
Slight modification: tree of losers
consider the parent node only (vs. sibling nodes)
Selection Trees (5/6)
After one record has been output
(figure: the winner tree is restructured along the path from the changed leaf to the root)
Selection Trees (6/6)
A tree of losers can be constructed from the winner tree
(figure: each nonleaf of the loser tree holds the loser of the match played there; the overall winner is kept above the root)
Forests (1/4)
Definition:
A forest is a set of n >= 0 disjoint trees
Transforming a forest into a binary tree
Definition: If T1,…,Tn is a forest of trees, then the
binary tree corresponding to this forest, denoted by
B(T1,…,Tn ):
is empty, if n = 0
has root equal to root(T1); has left subtree equal to
B(T11,T12,…,T1m); and has right subtree equal to
B(T2,T3,…,Tn)
where T11,T12,…,T1m are the subtrees of root (T1)
Forests (2/4)
Rotate the tree clockwise by 45 degrees
Leftmost child becomes the left child; right sibling becomes the right child
(figure: a forest of trees rooted at A, E, G, with A's children B, C, D, transformed into a binary tree with root A)
Forest traversals
Forest preorder traversal
Forests (3/4)
(1) If F is empty, then return.
(2) Visit the root of the first tree of F.
(3) Traverse the subtrees of the first tree in tree preorder.
(4) Traverse the remaining trees of F in preorder.
Forest inorder traversal
(1) If F is empty, then return.
(2) Traverse the subtrees of the first tree in tree inorder.
(3) Visit the root of the first tree of F.
(4) Traverse the remaining trees of F in inorder.
Forest postorder traversal
(1) If F is empty, then return.
(2) Traverse the subtrees of the first tree in tree postorder.
(3) Traverse the remaining trees of F in postorder.
(4) Visit the root of the first tree of F.
preorder: A B C D E F G H I
inorder: B C A E D G H F I
Forests (4/4)
root A; A's subtrees yield {B, C}; the remaining trees yield {E, D, G, H, F, I}
preorder: A B C (D E F G H I)
inorder: B C A (E D G H F I)
(figure: reconstructing the corresponding binary tree with root A, left subtree from B, C, and right subtree built from D, E, F, G, H, I)
Set Representation (1/13)
S1 = {0, 6, 7, 8}, S2 = {1, 4, 9}, S3 = {2, 3, 5}
Si ∩ Sj = ∅ (the sets are pairwise disjoint)
Two operations considered here
Disjoint set union: S1 ∪ S2 = {0, 6, 7, 8, 1, 4, 9}
Find(i): Find the set containing the element i.
3 ∈ S3, 8 ∈ S1
Set Representation (2/13)
Union and Find Operations
Make one of the trees a subtree of the other
(figure: S1 as a tree with root 0 and children 6, 7, 8; S2 with root 4 and children 1, 9; S1 ∪ S2 is obtained by making either root a child of the other)
int find1(int i) {
for(; parent[i] >= 0; i = parent[i]);
return i;
}
void union1(int i, int j) {
parent[i] = j;
}
Program 5.18: Initial attempt at union-find function (p.241)
Set Representation (5/13)
sequence: union(0,1), find(0); union(1,2), find(0); ...; union(n-2,n-1), find(0)
n-1 union operations: O(n)
n-1 find operations: O(n^2), since find(0) after i unions takes O(i) time on the degenerate tree
*Figure 5.43: Degenerate tree (p.242)
Set Representation(6/13)
weighting rule for union(i, j): if the number of nodes in tree i is less than the number in tree j, then make j the parent of i; otherwise make i the parent of j
Set Representation(7/13)
Modified Union Operation
void union2(int i, int j)
{
  /* union the sets with roots i and j using the weighting rule;
     a (negated) node count is kept in the root of each tree */
  int temp = parent[i] + parent[j];
  if (parent[i] > parent[j]) {
    /* tree i has fewer nodes (counts are negative) */
    parent[i] = j; /* make j the new root */
    parent[j] = temp;
  }
  else {
    parent[j] = i; /* make i the new root */
    parent[i] = temp;
  }
}
If the number of nodes in tree i is less than the number in tree j, then make j the parent of i; otherwise make i the parent of j.
Set Representation(8/13)
Figure 5.45: Trees achieving the worst-case bound (p.245), with ceil(log2 8) + 1 levels
Set Representation(9/13)
The definition of Ackermann’s function used
here is :
A(p, q) = 2^q                  if p = 0
A(p, q) = 0                    if q = 0 and p >= 1
A(p, q) = 2                    if p >= 1 and q = 1
A(p, q) = A(p-1, A(p, q-1))    if p >= 1 and q >= 2
Set Representation(10/13)
Modified Find(i) Operation
int find2(int i) {
  int root, trail, lead;
  for (root = i; parent[root] >= 0; root = parent[root])
    ;
  for (trail = i; trail != root; trail = lead) {
    lead = parent[trail];
    parent[trail] = root; /* collapsing: make trail a child of the root */
  }
  return root;
}
If j is a node on the path from i to its root, then make j a child of the root.
Set Representation(11/13)
(figure: the collapsing rule; after find2(7) on a tree with root 0, the nodes 6 and 7 on the find path become children of the root)
Counting Binary Trees (3/10)
Figure 5.49(c) with the node numbering of Figure 5.50. Its preorder permutation is 1, 2, ..., 9, and its inorder permutation is 2, 3, 1, 5, 4, 7, 8, 6, 9.
(figure: the numbered binary tree with root 1)
Counting Binary Trees (4/10)
If we start with the numbers 1, 2, 3, then the possible permutations obtainable by a stack are:
(1,2,3) (1,3,2) (2,1,3) (2,3,1) (3,2,1)
Obtaining (3,1,2) is impossible.
(figure: the five distinct binary trees on three nodes, one for each obtainable permutation)
Counting Binary trees(5/10)
Matrix Multiplication
Suppose that we wish to compute the product of n matrices:
M1 * M2 . . .* Mn
Since matrix multiplication is associative, we can perform
these multiplications in any order. We would like to know
how many different ways we can perform these
multiplications . For example, If n =3, there are two
possibilities:
(M1*M2)*M3
M1*(M2*M3)
Counting Binary Trees (6/10)
Let bn be the number of different ways to compute the product of n matrices. Then b2 = 1, b3 = 2, and b4 = 5.
Let Mij, i <= j, be the product Mi * Mi+1 * ... * Mj.
The product we wish to compute is M1n, which may be computed by computing any one of the products M1i * Mi+1,n, 1 <= i < n.
The number of distinct ways to obtain M1i and Mi+1,n are bi and bn-i, respectively. Therefore, letting b1 = 1, we have:
bn = sum_{i=1}^{n-1} bi * bn-i,  n > 1
Counting Binary Trees (7/10)
Now instead let bn be the number of distinct binary trees with n nodes. Again an expression for bn in terms of n is what we want. Then we see that bn is the sum of all the possible binary trees formed in the following way: a root and two subtrees with bi and bn-i-1 nodes, for 0 <= i < n. This explanation says that
bn = sum_{i=0}^{n-1} bi * bn-i-1,  n >= 1, and b0 = 1    (5.5)
(figure: a root whose left subtree has i nodes, counted by bi, and whose right subtree has n-i-1 nodes, counted by bn-i-1)
Counting Binary Trees (8/10)
Number of Distinct Binary Trees:
To obtain the number of distinct binary trees with n nodes, we must solve the recurrence of Eq. (5.5). To begin we let:
B(x) = sum_{i>=0} bi * x^i    (5.6)
which is the generating function for the number of binary trees. Next observe that by the recurrence relation we get the identity:
x * B^2(x) = B(x) - 1
Using the formula to solve quadratics and the fact (Eq. (5.5)) that B(0) = b0 = 1, we get:
B(x) = (1 - sqrt(1 - 4x)) / (2x)
Counting Binary Trees (9/10)
Number of Distinct Binary Trees:
We can use the binomial theorem to expand (1 - 4x)^(1/2) to obtain:
B(x) = (1/2x) * (1 - sum_{n>=0} C(1/2, n) * (-4x)^n)
     = sum_{m>=0} C(1/2, m+1) * (-1)^m * 2^(2m+1) * x^m    (5.7)
Counting Binary Trees (10/10)
Number of Distinct Binary Trees:
Comparing Eqs. (5.6) and (5.7) we see that bn, the coefficient of x^n in B(x), is:
bn = C(1/2, n+1) * (-1)^n * 2^(2n+1)
Some simplification yields the more compact form
bn = (1/(n+1)) * C(2n, n)
which is approximately bn = O(4^n / n^(3/2))
Chapter 7 Sorting: Outline
Introduction
Searching and List Verification
Definitions
Insertion Sort
Quick Sort
Merge Sort
Heap Sort
Counting Sort
Radix Sort
Shell Sort
Summary of Internal Sorting
Introduction (1/9)
Why efficient sorting methods are so important ?
The efficiency of a searching strategy depends
on the assumptions we make about the
arrangement of records in the list
No single sorting technique is the “best” for all
initial orderings and sizes of the list being sorted.
We examine several techniques, indicating when
one is superior to the others.
Introduction (2/9)
Sequential search
We search the list by examining the key values list[0].key, ..., list[n-1].key, in that order, until the correct record is located or we have examined all the records in the list
Example: a list of n = 12 records: 4, 15, 17, 26, 30, 46, 48, 56, 58, 82, 90, 95
Unsuccessful search: n + 1 comparisons, O(n)
Average successful search: sum_{i=0}^{n-1} (i + 1)/n = (n + 1)/2 comparisons, O(n)
Introduction (3/9)
Binary search
Binary search assumes that the list is ordered on the key
field such that list[0].key list[1]. key … list[n-1]. key.
This search begins by comparing searchnum (the search key) and list[middle].key, where middle = (n-1)/2
list: 4, 15, 17, 26, 30, 46, 48, 56, 58, 82, 90, 95
Figure 7.1: Decision tree for binary search (p.323) - root 46 [5]; second level 17 [2] and 58 [8]; third level 4 [0], 26 [3], 48 [6], 90 [10]; leaves 15 [1], 30 [4], 56 [7], 82 [9], 95 [11]
Introduction (4/9)
Binary search (cont’d)
Analysis of binsearch: makes no more than O(log n) comparisons
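A sketch of binsearch over a plain int array (the record/key structure is simplified to bare keys for this example):

```c
#include <assert.h>

/* search list[0..n-1], sorted ascending, for searchnum;
   return its index if found, -1 otherwise */
int binsearch(int list[], int searchnum, int n)
{
    int left = 0, right = n - 1;
    while (left <= right) {
        int middle = (left + right) / 2;
        if (searchnum < list[middle])
            right = middle - 1;      /* search the lower half */
        else if (searchnum > list[middle])
            left = middle + 1;       /* search the upper half */
        else
            return middle;           /* found */
    }
    return -1;                       /* not found */
}
```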
Introduction (5/9)
List Verification
Compare lists to verify that they are identical or
identify the discrepancies.
Example
Internal Revenue Service (IRS)
(e.g., employee vs. employer)
Reports three types of errors:
all records found in list1 but not in list2
all records found in list2 but not in list1
all records found in both lists with the same key but different values in some field
Introduction (6/9)
Verifying using a sequential search
Check whether the elements in list1 are also in list2
Any element remaining in either list is not a member of the other list
Introduction (8/9)
Complexities
Assume the two lists are randomly arranged
Verify1: O(mn)
Verify2: sorts them before verification
O(tsort(n) + tsort(m) + n + m) = O(max(n log n, m log m))
tsort(n): the time needed to sort the n records in list1
tsort(m): the time needed to sort the m records in list2
we will show it is possible to sort n records in O(nlogn) time
Definition
Given (R0, R1, ..., Rn-1), each Ri has a key value Ki
find a permutation σ such that Kσ(i-1) <= Kσ(i), 0 < i <= n-1
σ denotes a unique permutation
Sorted: Kσ(i-1) <= Kσ(i), 0 < i <= n-1
Stable: if i < j and Ki = Kj, then Ri precedes Rj in the sorted list
Introduction (9/9)
Two important applications of sorting:
An aid to search
Matching entries in lists
Internal sort
The list is small enough to sort entirely in main
memory
External sort
There is too much information to fit into main
memory
Insertion Sort (1/3)
Concept:
The basic step in this method is to insert a record R into a sequence of ordered records, R1, R2, ..., Ri (K1 <= K2 <= ... <= Ki), in such a way that the resulting sequence of size i+1 is also ordered
Variation
Binary insertion sort
reduce search time
List insertion sort
reduce insert time
Insertion Sort (2/3)
Insertion sort program
(figure: step-by-step trace of the insertion sort program on a five-element list list[0..4])
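A sketch of the insertion sort on a plain int array (the record/key structure is simplified to bare keys for this example):

```c
#include <assert.h>

/* insert list[i] into the already sorted prefix list[0..i-1] */
void insertion_sort(int list[], int n)
{
    for (int i = 1; i < n; i++) {
        int next = list[i], j;
        for (j = i - 1; j >= 0 && next < list[j]; j--)
            list[j + 1] = list[j];  /* shift larger records right */
        list[j + 1] = next;
    }
}
```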
Insertion Sort (3/3)
Analysis of insertion sort:
If k is the number of records LOO (left out of order), then the computing time is O((k+1)n)
Ri is LOO iff Ri < max{Rj : 0 <= j < i}
The worst-case time is O(n^2):
sum_{i=1}^{n-1} O(i) = O(n^2)
The best case (an already sorted list) is O(n)
Quick Sort (1/6)
The quick sort scheme developed by C. A. R.
Hoare has the best average behavior among all
the sorting methods we shall be studying
Given (R0, R1, ..., Rn-1), let Ki denote the pivot key
If Ki is placed in position s(i),
then Kj <= Ks(i) for j < s(i), and Kj >= Ks(i) for j > s(i).
After such a positioning has been made, the original file is partitioned into two subfiles, {R0, ..., Rs(i)-1}, Rs(i), {Rs(i)+1, ..., Rn-1}, and they will be sorted independently
Quick Sort (2/6)
Quick Sort Concept
select a pivot key
interchange the elements to their correct positions according to the pivot
the original file is partitioned into two subfiles and they will be sorted independently
R0 R1 R2 R3 R4 R5 R6 R7 R8 R9
26  5 37  1 61 11 59 15 48 19
11  5 19  1 15 26 59 61 48 37
1 5 11 19 15 26 59 61 48 37
1 5 11 19 15 26 59 61 48 37
1 5 11 15 19 26 59 61 48 37
1 5 11 15 19 26 48 37 59 61
1 5 11 15 19 26 37 48 59 61
1 5 11 15 19 26 37 48 59 61
In-Place Partitioning Example
a: 6 2 8 5 11 10 4 1 9 7 3    (pivot 6)
a: 6 2 3 5 11 10 4 1 9 7 8
a: 6 2 3 5 1 10 4 11 9 7 8
a: 6 2 3 5 1 4 10 11 9 7 8
a: 4 2 3 5 1 6 10 11 9 7 8    (scans have crossed; pivot swapped into its final position)
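A sketch of quick sort with the leftmost element as pivot, as in the partitioning example above (plain int keys; explicit bounds guards stand in for the book's sentinel assumption):

```c
#include <assert.h>

/* quick sort of list[left..right]: i and j scan inward,
   out-of-place pairs are swapped, and the pivot is finally
   swapped into its sorted position j */
void quicksort(int list[], int left, int right)
{
    if (left < right) {
        int pivot = list[left], i = left, j = right + 1, temp;
        do {
            do i++; while (i <= right && list[i] < pivot);
            do j--; while (list[j] > pivot);
            if (i < j) {
                temp = list[i]; list[i] = list[j]; list[j] = temp;
            }
        } while (i < j);
        temp = list[left]; list[left] = list[j]; list[j] = temp;
        quicksort(list, left, j - 1);   /* keys <= pivot */
        quicksort(list, j + 1, right);  /* keys >= pivot */
    }
}
```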
Merge Sort (1/7)
Before looking at the merge sort algorithm to sort
n records, let us see how one may merge two
sorted lists to get a single sorted list.
Merging
Uses O(n) additional space.
It merges the sorted lists
(list[i], … , list[m]) and (list[m+1], …, list[n]),
into a single sorted list, (sorted[i], … , sorted[n]).
Copy sorted[1..n] to list[1..n]
Merge (using O(n) space)
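The merge step just described can be sketched as follows (plain int keys; sorted[] is the O(n) auxiliary array):

```c
#include <assert.h>

/* merge the sorted runs list[i..m] and list[m+1..n]
   into sorted[i..n] */
void merge(int list[], int sorted[], int i, int m, int n)
{
    int j = m + 1, k = i;
    while (i <= m && j <= n) {
        if (list[i] <= list[j]) sorted[k++] = list[i++];
        else                    sorted[k++] = list[j++];
    }
    while (i <= m) sorted[k++] = list[i++]; /* rest of first run */
    while (j <= n) sorted[k++] = list[j++]; /* rest of second run */
}
```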
Merge Sort (3/7) - (7/7)
Recursive merge sort concept
(figures: the list is recursively split into halves, each half is sorted recursively, and the two sorted halves are merged)
Heap Sort (1/3)
The challenges of merge sort
The merge sort requires additional storage
proportional to the number of records in the file
being sorted.
Heap sort
Require only a fixed amount of additional storage
Slightly slower than merge sort using O(n)
additional space
The worst case and average computing time is
O(n log n), same as merge sort
Unstable
adjust
adjust the binary tree to establish the heap
(figure: trace of heap sort on a 10-element list; the array is first turned into a max heap bottom-up, then the root is repeatedly swapped with the last element and re-adjusted top-down, producing ascending order)
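A sketch of adjust and heap sort over list[1..n] (index 0 unused, plain int keys):

```c
#include <assert.h>

/* adjust: re-establish the max heap rooted at index root in
   list[1..n], assuming both subtrees of root are already heaps */
void adjust(int list[], int root, int n)
{
    int child = 2 * root;
    int rootkey = list[root];
    while (child <= n) {
        if (child < n && list[child] < list[child + 1])
            child++;                   /* pick the larger child */
        if (rootkey >= list[child]) break;
        list[child / 2] = list[child]; /* move the child up */
        child *= 2;
    }
    list[child / 2] = rootkey;
}

/* heap sort: build the max heap bottom-up, then repeatedly
   swap the max to the end and re-adjust the smaller heap */
void heap_sort(int list[], int n)
{
    int i, temp;
    for (i = n / 2; i > 0; i--)
        adjust(list, i, n);
    for (i = n - 1; i > 0; i--) {
        temp = list[1];
        list[1] = list[i + 1];
        list[i + 1] = temp;            /* current max to the end */
        adjust(list, 1, i);
    }
}
```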
Counting Sort
For key values within small range
1. scan list[1..n] to count the frequency of
every value
2. sum to find proper index to put value x
3. scan list[1..n] and put to sorted[]
4. copy sorted to list
O(n) for time and space
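The four steps above can be sketched as follows (keys assumed to lie in 0..k-1; the list is 0-indexed here):

```c
#include <assert.h>
#include <stdlib.h>

/* counting sort for keys in 0..k-1: O(n + k) time and space */
void counting_sort(int list[], int n, int k)
{
    int *count = calloc(k, sizeof(int));
    int *sorted = malloc(n * sizeof(int));
    int i, x, pos = 0;
    for (i = 0; i < n; i++)
        count[list[i]]++;                 /* 1. frequencies */
    for (x = 0; x < k; x++) {             /* 2. prefix sums:      */
        int c = count[x];                 /* count[x] becomes the */
        count[x] = pos;                   /* first index of key x */
        pos += c;
    }
    for (i = 0; i < n; i++)               /* 3. place (stable) */
        sorted[count[list[i]]++] = list[i];
    for (i = 0; i < n; i++)               /* 4. copy back */
        list[i] = sorted[i];
    free(count);
    free(sorted);
}
```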
Radix Sort (1/5)
We considers the problem of sorting records that
have several keys
These keys are labeled K0 (most significant key), K1,
… , Kr-1 (least significant key).
Let Ki j denote key Kj of record Ri.
A list of records R0, … , Rn-1, is lexically sorted with
respect to the keys K0, K1, … , Kr-1 iff
(Ki0, Ki1, …, Kir-1) (K0i+1, K1i+1, …, Kr-1i+1), 0 i < n-1
Radix Sort (2/5)
Example
sorting a deck of cards on two keys, suit and face
value, in which the keys have the ordering relation:
K0 [Suit]: <<<
K1 [Face value]: 2 < 3 < 4 < … < 10 < J < Q < K < A
Thus, a sorted deck of cards has the ordering:
2, …, A, … , 2, … , A
Two approaches to sort:
1. MSD (Most Significant Digit) first: sort on K0, then K1, ...
2. LSD (Least Significant Digit) first: sort on Kr-1, then Kr-2, ...
Radix Sort (3/5)
MSD first
1. MSD sort first, e.g., bin sort, four bins
2. LSD sort second
Result: 2, …, A, … , 2, … , A
LSD first Radix Sort (4/5)
1.LSD sort first, e.g., face sort,
13 bins 2, 3, 4, …, 10, J, Q, K, A
2.MSD sort second (may not needed, we can just classify
these 13 piles into 4 separated piles by considering them
from face 2 to face A)
Simpler than the MSD one because we do not have to
sort the subpiles independently
Result:
2, …, A, … ,
2, …, A
Radix Sort (5/5)
We also can use an LSD or MSD sort when we
have only one logical key, if we interpret this key
as a composite of several keys.
Example:
integer: the digit in the far right position is the least
significant and the most significant for the far left
position
range: 0 K 999 MSD LSD
0-9 0-9 0-9
using LSD or MSD sort for three keys (K0, K1, K2)
since an LSD sort does not require the maintainence
of independent subpiles, it is easier to implement
Shell Sort
for (h = magic1; h > 0; h /= magic2)
  insertion-sort the elements at distance h
Idea: give data a chance to make a "long jump"
Insertion sort is very fast for a partially sorted array
The problem is how to find good magic values; several gap sequences have been discussed
Remember 3n+1 (the sequence h = 3h+1: 1, 4, 13, 40, ...)
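A sketch of the scheme, with the simple h = n/2, h /= 2 gap sequence standing in for the "magic" values (better sequences exist, as the slide notes):

```c
#include <assert.h>

/* shell sort: insertion-sort the elements h apart, then
   halve the gap h until it reaches 1 */
void shell_sort(int list[], int n)
{
    for (int h = n / 2; h > 0; h /= 2) {
        for (int i = h; i < n; i++) {
            int next = list[i], j;
            for (j = i - h; j >= 0 && next < list[j]; j -= h)
                list[j + h] = list[j];  /* shift larger elements */
            list[j + h] = next;
        }
    }
}
```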
Summary of Internal Sorting (1/2)
Insertion Sort
Works well when the list is already partially ordered
The best sorting method for small n
Merge Sort
The best/worst case (O(nlogn))
Require more storage than a heap sort
Slightly more overhead than quick sort
Quick Sort
The best average behavior
The worst complexity in worst case (O(n2))
Radix Sort
Depend on the size of the keys and the choice of the radix
Summary of Internal Sorting (2/2)
Analysis of the average running times
CS235102
Data Structures
Chapter 8 Hashing
Chapter 8 Hashing: Outline
The Symbol Table Abstract Data Type
Static Hashing
Hash Tables
Hashing Functions
Mid-square
Division
Folding
Digit Analysis
Overflow Handling
Linear Open Addressing, Quadratic probing, Rehashing
Chaining
The Symbol Table ADT (1/3)
Many examples of dictionaries are found in applications, e.g., a spelling checker
In computer science, we generally use the term
symbol table rather than dictionary, when
referring to the ADT.
We define the symbol table as a set of name-
attribute pairs.
Example: In a symbol table for a compiler
the name is an identifier
the attributes might include an initial value
a list of lines that use the identifier.
The Symbol Table ADT (2/3)
Operations on symbol table:
Determine if a particular name is in the table
Retrieve/modify the attributes of that name
Insert/delete a name and its attributes
Implementations
Binary search tree: the complexity is O(n)
Some other binary trees (chapter 10): O(log n).
Hashing
A technique for search, insert, and delete operations
that has very good expected performance.
The Symbol Table ADT (3/3)
Search Techniques
Search tree methods
Identifier comparisons
Hashing methods
Relies on a formula called the hash function.
Types of hashing
Static hashing
Dynamic hashing
Hash Tables (1/6)
In static hashing, we store the identifiers in a
fixed size table called a hash table
Arithmetic function, f
To determine the address of an identifier, x, in the
table
f(x) gives the hash, or home address, of x in the table
Hash table, ht
Stored in sequential memory locations that are
partitioned into b buckets, ht[0], …, ht[b-1].
Each bucket has s slots
Hash Tables (2/6)
(figure: hash table ht stored as b buckets ht[0], ..., ht[b-1], each with s slots; f(x) maps an identifier x to a bucket address in 0 ... b-1)
Hash Tables (3/6)
The identifier density of a hash table
is the ratio n/T
n is the number of identifiers in the table
T is possible identifiers
The loading density or loading factor of a hash table is α = n/(sb)
s is the number of slots
b is the number of buckets
Hash Tables (4/6)
Two identifiers, i1 and i2 are synonyms with
respect to f if f(i1) = f(i2)
We enter distinct synonyms into the same bucket as
long as the bucket has slots available
An overflow occurs when we hash a new
identifier into a full bucket
A collision occurs when we hash two
non-identical identifiers into the same bucket.
When the bucket size is 1, collisions and
overflows occur simultaneously.
Hash Tables (5/6)
Example 8.1: Hash table
b = 26 buckets and s = 2 slots. Distinct identifiers n = 10
The loading factor, α, is 10/52 = 0.19.
Associate the letters a-z with the numbers 0-25, respectively
Define a fairly simple hash function, f(x), as the first character of x.
C library functions (f(x)): acos(0), define(3), float(5), exp(4), char(2), atan(0), ceil(2), floor(5), clock(2), ctime(2)
Synonyms: {acos, atan}, {char, ceil, clock, ctime}, {float, floor}
overflow: clock, ctime
Hash Tables (6/6)
The time required to enter, delete, or search for
identifiers does not depend on the number of
identifiers n in use; it is O(1).
Hash function requirements:
Easy to compute and produces few collisions.
Unfortunately, since the ratio b/T is usually small, we cannot avoid collisions altogether.
=> Overload handling mechanisms are needed
Hashing Functions (1/8)
A hash function, f, transforms an identifier, x,
into a bucket address in the hash table.
We want a hash function that is easy to compute
and that minimizes the number of collisions.
Hashing functions should be unbiased.
That is, if we randomly choose an identifier, x, from
the identifier space, the probability that f(x) = i is 1/b
for all buckets i.
We call a hash function that satisfies unbiased
property a uniform hash function.
Mid-square, Division, Folding, Digit Analysis
Hashing Functions (2/8)
Mid-square fm(x)=middle(x2):
Frequently used in symbol table applications.
We compute fm by squaring the identifier and then
using an appropriate number of bits from the middle
of the square to obtain the bucket address.
The number of bits used to obtain the bucket address
depends on the table size. If we use r bits, the range of the values is 2^r.
Since the middle bits of the square usually depend
upon all the characters in an identifier, there is high
probability that different identifiers will produce
different hash addresses.
Hashing Functions (3/8)
Division fD(x) = x % M :
Using the modulus (%) operator.
We divide the identifier x by some number M and use
the remainder as the hash address for x.
This gives bucket addresses that range from 0 to M-1, where M is the table size.
The choice of M is critical.
If M is even, then odd keys hash to odd buckets and even keys to even buckets (biased!)
Hashing Functions (4/8)
The choice of M is critical (cont’d)
When many identifiers are permutations of each other, a biased use
of the table results.
Example: X=x1x2 and Y=x2x1
Internal binary representation: x1 --> C(x1) and x2 --> C(x2)
Each character is represented by six bits
X: C(x1) * 2^6 + C(x2), Y: C(x2) * 2^6 + C(x1)
(fD(X) - fD(Y)) % p (where p is a prime number)
= (C(x1) * 2^6 % p + C(x2) % p - C(x2) * 2^6 % p - C(x1) % p) % p
with p = 3 and 2^6 = 64:
(64 % 3 * C(x1) % 3 + C(x2) % 3 - 64 % 3 * C(x2) % 3 - C(x1) % 3) % 3
= (C(x1) % 3 + C(x2) % 3 - C(x2) % 3 - C(x1) % 3) % 3 = 0
The same behavior can be expected when p = 7
A good choice for M would be: M a prime number such that M does not divide r^k ± a for small k and a.
Hashing Functions (5/8)
Folding
Partition identifier x into several parts
All parts except for the last one have the same length
Add the parts together to obtain the hash address
Two possibilities (divide x into several parts)
Shift folding:
Shift all parts except for the last one, so that the least
significant bit of each part lines up with corresponding
bit of the last part.
x1=123, x2=203, x3=241, x4=112, x5=20, address=699
Folding at the boundaries:
reverses every other partition before adding
x1=123, x2=302, x3=241, x4=211, x5=20, address=897
Hashing Functions (6/8)
Folding example:
123 203 241 112 20
P1 P2 P3 P4 P5
shift folding 123
203
241
112
20
699
# of key comparisons=21/11=1.91
Comparison: Overflow Handling (8/8)
In Figure 8.7, the values in each column give the average
number of bucket accesses made in searching eight
different tables with 33,575, 24,050, 4909, 3072, 2241,
930, 762, and 500 identifiers each.
Chaining performs better than linear open addressing.
We can see that division is generally superior to the
other hash functions.
[Figure: array trace of inserting item = 80 into a min-max heap]
MIN-MAX Heaps (6/10)
Deletion of min element
If we wish to delete the element with the smallest key,
then this element is in the root.
In the general situation, we reinsert an element item
into a min-max heap whose root is empty.
We consider the two cases:
1. The root has no children
Item is to be inserted into the root.
2. The root has at least one child.
The smallest key in the min-max heap is in one of the children
or grandchildren of the root. We determine the node k with the
smallest key.
The following possibilities need to be considered:
MIN-MAX Heaps (7/10)
a) item.key <= heap[k].key
No element in the heap has a key smaller than item.key,
so item may be inserted into the root.
b) item.key > heap[k].key, k is a child of the root
Since k is a max node, it has no descendants with key
larger than heap[k].key. Hence, node k has no
descendants with key larger than item.key.
heap[k] may be moved to the root and item inserted into
node k.
MIN-MAX Heaps (8/10)
c) item.key > heap[k].key, k is a grandchild of the root
In this case, heap[k] may be moved to the root; node k
is now seen as empty.
Let parent be the parent of k.
If item.key > heap[parent].key, then interchange them.
This ensures that the max node parent contains the
largest key in the sub-heap with root parent.
At this point, we are faced with the problem of inserting
item into the sub-heap with root k.
Therefore, we repeat the above process.
delete_min: complexity: O(log n)
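The deletion procedure relies on knowing whether an index lies on a min or a max level. A small C helper, assuming 1-based array storage (the name on_min_level is mine, not the book's):

```c
#include <assert.h>

/* In a min-max heap stored 1-based in an array, a node at index i is
   on a min level iff its depth, floor(log2 i), is even. */
int on_min_level(int i) {
    int depth = 0;
    while (i > 1) { i /= 2; depth++; }   /* walk up to the root */
    return depth % 2 == 0;
}
```

The root (i = 1, depth 0) is on a min level; its children (i = 2, 3) are on a max level, and so on alternately.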
Delete the minimum element from the min-max heap
[Figure: array trace of delete_min on a min-max heap with n = 12]
MIN-MAX Heaps (10/10)
Deletion of max element
1. Determine the children of the root (they lie on the
max level) and find the larger one; it holds the largest
key in the min-max heap.
2. Consider that node as the root of a max-min heap.
3. Delete it from this max-min heap using an approach
similar to the deletion of the min element described above.
Deaps(1/8)
Definition
The root contains no element
The left subtree is a min-heap
The right subtree is a max-heap
Constraint between the two trees:
let i be any node in left subtree, j be the
corresponding node in the right subtree.
if j does not exist, let j correspond to the parent of i
i.key <= j.key
Deaps(2/8)
i = min_partner(n) = n - 2^(floor(log2 n) - 1)
j = max_partner(n) = n + 2^(floor(log2 n) - 1)
if j > heapsize, then j /= 2
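These partner formulas translate directly to C; a sketch assuming a 1-based array with position 1 (the empty root) unused — the function names are mine:

```c
#include <assert.h>

/* Partner positions in a deap. level_offset computes
   2^(floor(log2 i) - 1) without floating point. */
int level_offset(int i) {
    int p = 1;
    while (i >= 4) { i /= 2; p *= 2; }
    return p;
}

int min_partner(int i) { return i - level_offset(i); }

int max_partner(int i, int heapsize) {
    int j = i + level_offset(i);
    if (j > heapsize) j /= 2;   /* partner position lies beyond the
                                   heap: use its parent instead */
    return j;
}
```

For example, node 4 in the min heap pairs with node 6 in the max heap, and node 5 with node 7.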
Deaps Insert(3/8)
public void insert(int x) {
  int i;
  if (++n == 2) { deap[2] = x; return; }
  if (inMaxHeap(n)) {
    i = minPartner(n);
    if (x < deap[i]) {
      deap[n] = deap[i];
      minInsert(i, x);
    } else maxInsert(n, x);
  } else {
    i = maxPartner(n);
    if (x > deap[i]) {
      deap[n] = deap[i];
      maxInsert(i, x);
    } else minInsert(n, x);
  }
}
Deaps(4/8)
Insertion Into A Deap
Deaps(5/8)
Deaps(6/8)
Deaps delete min(7/8)
public int deleteMin() {
  int i, j, key = deap[2], x = deap[n--];
  // move smaller child to i
  for (i = 2; 2*i <= n; deap[i] = deap[j], i = j) {
    j = i * 2;
    if (j+1 <= n && deap[j] > deap[j+1]) j++;
  }
  // try to put x at leaf i
  j = maxPartner(i);
  if (x > deap[j]) {
    deap[i] = deap[j];
    maxInsert(j, x);
  } else {
    minInsert(i, x);
  }
  return key;
}
Deaps(8/8)
Leftist Trees(1/7)
Supports combine (merging two trees into one)
Leftist Trees(2/7)
shortest(x) = 0 if x is an external node; otherwise
shortest(x) = 1 + min{shortest(left(x)), shortest(right(x))}
Leftist Trees(3/7)
Definition: shortest(left(x)) >= shortest(right(x))
Leftist Trees(4/7)
Algorithm for combine(a, b):
assume a.data <= b.data
if a.right is null, then make b the right child of a
else combine(a.right, b)
if shortest(a.right) > shortest(a.left), then exchange
a's subtrees
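A C sketch of this combine algorithm for leftist trees; the node layout and names are assumptions, and the exchange is performed after the recursive call returns:

```c
#include <assert.h>
#include <stdlib.h>

typedef struct lnode {
    int data, shortest;
    struct lnode *left, *right;
} LNode;

int shortest_of(LNode *x) { return x ? x->shortest : 0; }

LNode *lnew(int v) {
    LNode *n = malloc(sizeof *n);
    n->data = v; n->shortest = 1; n->left = n->right = NULL;
    return n;
}

LNode *combine(LNode *a, LNode *b) {
    if (!a) return b;
    if (!b) return a;
    if (a->data > b->data) { LNode *t = a; a = b; b = t; } /* ensure a.data <= b.data */
    a->right = combine(a->right, b);
    if (shortest_of(a->right) > shortest_of(a->left)) {    /* restore leftist property */
        LNode *t = a->left; a->left = a->right; a->right = t;
    }
    a->shortest = shortest_of(a->right) + 1;
    return a;
}
```

After any sequence of combines the smallest element is at the root, and each node's left subtree has shortest-path length at least that of its right subtree.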
Leftist Trees(5/7)
Leftist Trees(6/7)
Leftist Trees(7/7)
Binomial Heaps(1/10)
Cost Amortization (paying in installments)
every operation in leftist trees costs O(log n)
the actual cost of a delete in a Binomial Heap could be O(n),
but insert and combine are O(1)
cost amortization: charge some of the cost of a heavy
operation to lightweight operations
the amortized cost of a Binomial Heap delete is O(log n)
A tighter bound can be achieved for a sequence of operations:
the actual cost of any sequence of i inserts, c combines,
and dm deletes in Binomial Heaps is O(i + c + dm log i)
Binomial Heaps(2/10)
Definition of Binomial Heap
Node fields: degree, child, left_link, right_link, data, parent
roots are doubly linked
a points to smallest root
Binomial Heaps(3/10)
Binomial Heaps(4/10)
Insertion Into A Binomial Heap
make a new node and add it to the doubly linked circular
list pointed at by a
set a to the root with smallest key
Combine two B-heaps a and b
combine two doubly linked circular lists to one
set a to the root with smallest key
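The O(1) insert can be sketched in C as a splice into the circular doubly linked root list; the field names follow the node definition above, the rest (bheap_insert, calloc initialization) is my assumption:

```c
#include <assert.h>
#include <stdlib.h>

typedef struct bnode {
    int data, degree;
    struct bnode *left_link, *right_link, *child, *parent;
} BNode;

/* Insert x as a new one-node tree; `a` always points to the root
   with the smallest key. Returns the (possibly updated) a. */
BNode *bheap_insert(BNode *a, int x) {
    BNode *n = calloc(1, sizeof *n);
    n->data = x;
    if (!a) {                        /* heap was empty */
        n->left_link = n->right_link = n;
        return n;
    }
    n->right_link = a->right_link;   /* splice n just after a */
    n->left_link = a;
    a->right_link->left_link = n;
    a->right_link = n;
    return (x < a->data) ? n : a;    /* keep a at the min root */
}
```

Combining two B-heaps works the same way: splice the two circular lists together and reset a to the smaller of the two min roots.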
Binomial Heaps(5/10)
Deletion Of Min Element
Binomial Heaps(6/10)
Binomial Heaps(7/10)
Binomial Heaps(8/10)
Binomial Heaps(9/10)
Binomial Heaps(10/10)
Trees in B-Heaps are binomial trees
B0 has exactly one node
Bk, k > 0, consists of a root with degree k
whose subtrees are B0, B1, ..., Bk-1
Bk has exactly 2^k nodes
the actual cost of a delete is O(log n + s)
s = number of min-trees in a (the original roots minus 1)
and y (the children of the removed node)
Fibonacci Heaps(1/8)
Definition
delete: delete the element in a specified node
decrease key
These two operations are followed by a cascading cut
Fibonacci Heaps(2/8)
Deletion From An F-heap
two cases: the deleted node is the min, or it is not
Fibonacci Heaps(3/8)
Decrease Key
if the node is not the min and the decreased key is smaller
than its parent's key, then cut the node from its parent
Fibonacci Heap(4/8)
To prevent the amortized cost of delete min from
becoming O(n), each node may have only one
child deleted.
If two children of x were deleted, then x
must be cut and moved to the ring of roots.
A flag (true or false) indicates whether one
of x's children has been cut.
Fibonacci Heaps(5/8)
Cascading Cut
Fibonacci Heaps(6/8)
Lemma
the ith child of any node x in an F-Heap has a
degree of at least i - 2, except when i = 1:
then the degree is 0
Corollary
Let Sk be the minimum possible number of
descendants of a node of degree k; then S0 = 1,
S1 = 2. From the lemma above, we get
Sk = (sum of Si for i = 0 to k-2) + 2
(the 2 comes from the 1st child and the root)
Fibonacci Heaps(7/8)
Fk+2 = (sum of Fi for i = 2 to k) + 2
Sk = Fk+2
That’s why the data structure is called
Fibonacci Heap
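The identity Sk = Fk+2 can be checked numerically; a small C sketch of the two recurrences (function names are mine):

```c
#include <assert.h>

/* S_0 = 1, S_1 = 2, S_k = (sum of S_0 .. S_{k-2}) + 2 */
long S(int k) {
    if (k == 0) return 1;
    if (k == 1) return 2;
    long sum = 2;                       /* the "+2" term */
    for (int i = 0; i <= k - 2; i++)
        sum += S(i);
    return sum;
}

/* Fibonacci numbers with F_1 = F_2 = 1 */
long fib(int n) {
    long a = 1, b = 1;
    for (int i = 2; i < n; i++) { long t = a + b; a = b; b = t; }
    return b;
}
```

For instance S(4) = 8 = F6, so a degree-4 node has at least 8 descendants: the node count grows exponentially in the degree, which bounds the maximum degree by O(log n).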
Fibonacci Heaps(8/8)
Application Of F-heaps
CS235102
Data Structures
Chapter 10 Search Structures
Search Structures: Outline
Optimal Binary Search Trees
AVL Trees
2-3 Trees
2-3-4 Trees
Red Black Trees
B-Trees
Optimal binary search trees (1/14)
In this section we look at the construction of
binary search trees for a static set of identifiers
Make no additions to or deletions from the set
Only perform searches
We examine the correspondence between a
binary search tree and the binary search
function
Optimal binary search trees (2/14)
Example: a binary search on the list (do, if, while)
is equivalent to using the function (search2) on the
binary search tree
Optimal binary search trees (3/14)
For a given static list, we decide on a cost measure
for search trees in order to find an optimal binary
search tree
Assume that we wish to search for an identifier at
level k of a binary search tree.
Generally, the number of iterations of binary search
equals the level number of the identifier we seek.
It is reasonable to use the level number of a node as
its cost.
A full binary tree may not be an optimal binary
search tree if the identifiers are searched for
with different frequencies.
Consider these two trees: [figure omitted]
Optimal binary search trees (7/14)
The maximum and minimum possible values for I
with n internal nodes
Maximum:
The worst case occurs when the tree is skewed, that
is, the tree has a depth of n.
Minimum:
We must have as many internal nodes as close to the
root as possible in order to obtain trees with minimal I
One tree with minimal internal path length is the
complete binary tree, in which the distance of node i
from the root is floor(log2 i).
Optimal binary search trees (8/14)
In the binary search tree:
The identifiers a1, a2, …, an with a1 < a2 < … < an
The probability of searching for each ai is pi
The total cost (when only successful searches are
made) is:
sum over 1 <= i <= n of pi * level(ai)
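This cost measure, the sum of pi times the level of ai, is straightforward to evaluate for a fixed tree shape; a C sketch with illustrative numbers (not from the text):

```c
#include <assert.h>

/* Expected successful-search cost of a fixed tree shape:
   sum of p[i] * level[i], with the root at level 1. */
double tree_cost(const double p[], const int level[], int n) {
    double cost = 0.0;
    for (int i = 0; i < n; i++)
        cost += p[i] * level[i];
    return cost;
}
```

For (do, if, while) with `if` at the root (levels 2, 1, 2) and equal probabilities 1/3, the cost is 5/3 iterations per search.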
Computation is carried out row-wise from row 0 to row 4