Day 3.1: Data Structures


Data Structures

11/08/2022
Array and Linked Lists

Overview and Reading
• Reading: Chapters: 3.1, 3.2, and 3.3

• Basic Elementary Data Structures

• Array
• Linked Lists
– Singly linked lists
– Doubly linked lists
– Circular linked lists
• These are used for more advanced data structures later

Array (§ 3.1)
[Figure: elements a, b, c, d stored in consecutive memory locations, beginning at address start]

• Stores data in sequential memory locations
• Access each element using an integer index
• Very basic, popular, and simple
• C++: int a[10]; or int *a = new int[10];
Array: Problems
• Insertion and deletion: difficult
– Need to shift elements to make space for an insertion
– Need to fill empty positions after a deletion

• Why don’t we connect the elements “logically” rather than “physically”?
– Linked List
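The shifting cost described above can be sketched as follows; this is a minimal Python illustration (the helper names insert_at and delete_at are ours, not from the slides):

```python
def insert_at(arr, n, i, value):
    """Insert value at index i in arr (first n slots in use),
    shifting elements right to open a gap. Returns the new length."""
    for j in range(n, i, -1):      # O(n) shifts in the worst case
        arr[j] = arr[j - 1]
    arr[i] = value
    return n + 1

def delete_at(arr, n, i):
    """Delete the element at index i, shifting left to fill the gap."""
    for j in range(i, n - 1):      # again O(n) shifts
        arr[j] = arr[j + 1]
    return n - 1

a = [10, 20, 30, 40, None]         # capacity 5, 4 slots in use
n = insert_at(a, 4, 1, 15)         # a becomes [10, 15, 20, 30, 40]
n = delete_at(a, n, 3)             # logical contents: [10, 15, 20, 40]
```

Both operations move O(n) elements in the worst case, which is exactly the problem linked lists avoid.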
Singly Linked List (§ 3.2)
• A singly linked list is a concrete data structure consisting of a sequence of nodes
• Each node stores
– an element (elem)
– a link to the next node (next)
Example: Linked list of strings

Singly Linked List of Strings: Picture
StringLinkedList
*head

StringNode StringNode StringNode StringNode StringNode

next next next next NULL

“yy” “bb” “cc” “dd” “ee”

Inserting at the Head

1. Allocate a new node


2. Insert a new element
3. Have the new node point to the old head
4. Update head to point to new node
Removing at the Head

1. Update head to point to next node in the list


2. Allow garbage collector to reclaim the former first node
(typically done by calling “delete” in C++)
Let’s write the code

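The head-insertion and head-removal steps above can be written out directly. Here is a minimal Python sketch following the slides' StringLinkedList picture (the to_list helper is ours, added for inspection):

```python
class Node:
    def __init__(self, elem, nxt=None):
        self.elem = elem
        self.next = nxt

class StringLinkedList:
    def __init__(self):
        self.head = None

    def add_front(self, elem):
        # 1-2: allocate a node holding elem; 3: point it at the old head;
        # 4: update head to the new node
        self.head = Node(elem, self.head)

    def remove_front(self):
        # update head to the next node; Python's garbage collector
        # reclaims the former first node (in C++ we would call delete)
        old = self.head
        self.head = old.next
        return old.elem

    def to_list(self):
        out, cur = [], self.head
        while cur is not None:
            out.append(cur.elem)
            cur = cur.next
        return out
```

Both operations touch only the head pointer, so they run in O(1) time.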
Inserting and Removing
at the Tail
Insertion at the tail:
1. Allocate a new node
2. Insert new element
3. Have new node point to null
4. Have old last node point to new node
5. Update tail to point to new node

Removal at the tail:
1. …
2. …
3. …
4. …
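The tail operations can be sketched in Python as below (class and method names are ours). Note the asymmetry the next slide exploits: insertion at the tail is O(1) with a tail pointer, but removal at the tail of a *singly* linked list must walk from the head to find the predecessor:

```python
class Node:
    def __init__(self, elem):
        self.elem = elem
        self.next = None

class SinglyLinkedList:
    def __init__(self):
        self.head = None
        self.tail = None

    def add_back(self, elem):
        node = Node(elem)              # steps 1-3: allocate, store, next = None
        if self.tail is None:
            self.head = node
        else:
            self.tail.next = node      # step 4: old last node points to new node
        self.tail = node               # step 5: update tail

    def remove_back(self):
        # no prev link, so we must walk from the head: O(n)
        elem = self.tail.elem
        if self.head is self.tail:
            self.head = self.tail = None
            return elem
        cur = self.head
        while cur.next is not self.tail:
            cur = cur.next
        cur.next = None
        self.tail = cur
        return elem
```

This O(n) tail removal is the motivation for the doubly linked list that follows.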
“Generic” Singly Linked Lists: Template

See the implementation code of member functions


in the text (page 122)
Doubly Linked List (§ 3.3)
• Singly Linked List
– Not easy to remove an element at the tail (or any other node)
• Doubly linked list: each node stores an element plus prev and next links
• Header and Trailer: dummy sentinel nodes

[Figure: header and trailer sentinels, with the element nodes linked between them]
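With the prev link and the header/trailer sentinels, removal of any node is O(1) and there are no special cases for an empty list. A minimal Python sketch under those assumptions (names are ours):

```python
class DNode:
    def __init__(self, elem=None):
        self.elem = elem
        self.prev = None
        self.next = None

class DoublyLinkedList:
    def __init__(self):
        # header and trailer are dummy sentinels holding no element
        self.header = DNode()
        self.trailer = DNode()
        self.header.next = self.trailer
        self.trailer.prev = self.header

    def insert_before(self, node, elem):
        new = DNode(elem)
        new.prev, new.next = node.prev, node
        node.prev.next = new
        node.prev = new
        return new

    def remove(self, node):
        # O(1): the prev link removes the need for any traversal
        node.prev.next = node.next
        node.next.prev = node.prev
        return node.elem

    def to_list(self):
        out, cur = [], self.header.next
        while cur is not self.trailer:
            out.append(cur.elem)
            cur = cur.next
        return out
```

Inserting before the trailer appends at the tail; inserting before header.next prepends at the head.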
C++ Implementation: Class Design

Constructor and Destructor (Don’t
forget!)

Circular Linked List (§ 3.3)
• A kind of Singly Linked List
• Rather than having a head or a tail, it forms a cycle
• Cursor
– A virtual starting node
– It can vary as we perform operations
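A circular list with a cursor might look like the following Python sketch (class and method names are ours); advance() is the operation the next slide asks about — it simply moves the cursor one node around the cycle:

```python
class CNode:
    def __init__(self, elem):
        self.elem = elem
        self.next = None

class CircleList:
    """Circular singly linked list. The cursor points at a node,
    and the node *after* the cursor is treated as the front."""
    def __init__(self):
        self.cursor = None

    def advance(self):
        self.cursor = self.cursor.next   # rotate the cursor one step

    def add(self, elem):
        node = CNode(elem)
        if self.cursor is None:
            node.next = node             # single node pointing to itself
            self.cursor = node
        else:
            node.next = self.cursor.next # splice in just after the cursor
            self.cursor.next = node

    def front(self):
        return self.cursor.next.elem
```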
C++ Implementation

What is advance()?

Stacks

Abstract Data Types (ADTs)
An abstract data type (ADT) is an abstraction of a data structure

An ADT specifies:
◼ Data stored
◼ Operations on the data
◼ Error conditions associated with operations

• Example: ADT modeling a simple stock trading system
– The data stored are buy/sell orders
– The operations supported are
order buy(stock, shares, price)
order sell(stock, shares, price)
void cancel(order)
– Error conditions:
Buy/sell a nonexistent stock
Cancel a nonexistent order
The Stack ADT
The Stack ADT stores arbitrary objects
Insertions and deletions follow the last-in first-out (LIFO) scheme
Think of a spring-loaded plate dispenser

Main stack operations:
◼ push(object): inserts an element
◼ object pop(): removes and returns the last inserted element

Auxiliary stack operations:
◼ object top(): returns the last inserted element without removing it
◼ integer size(): returns the number of elements stored
◼ boolean isEmpty(): indicates whether no elements are stored
Exceptions
Attempting the execution of an ADT operation may sometimes cause an error condition, called an exception
Exceptions are said to be “thrown” by an operation that cannot be executed

• In the Stack ADT, operations pop and top cannot be performed if the stack is empty
• Attempting the execution of pop or top on an empty stack throws an EmptyStackException
Applications of Stacks

Direct applications
◼ Page-visited history in a Web browser
◼ Undo sequence in a text editor
◼ Chain of method calls in programs
Indirect applications
◼ Auxiliary data structure for algorithms
◼ Component of other data structures

Array-based Stack
A simple way of implementing the Stack ADT uses an array
We add elements from left to right
A variable t keeps track of the index of the top element

Algorithm size()
  return t + 1

Algorithm pop()
  if isEmpty() then
    throw EmptyStackException
  else
    t ← t − 1
    return S[t + 1]

[Figure: array S, indices 0, 1, 2, …, t]
Array-based Stack (cont.)
The array storing the stack elements may become full
A push operation will then throw a FullStackException
◼ Limitation of the array-based implementation
◼ Not intrinsic to the Stack ADT

Algorithm push(o)
  if t = S.length − 1 then
    throw FullStackException
  else
    t ← t + 1
    S[t] ← o

[Figure: array S, indices 0, 1, 2, …, t]
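The size, push, and pop pseudocode above translates directly into a runnable sketch; here is a Python version (class names are ours, mirroring the slides' S and t):

```python
class FullStackException(Exception): pass
class EmptyStackException(Exception): pass

class ArrayStack:
    def __init__(self, capacity):
        self.S = [None] * capacity   # fixed-size array
        self.t = -1                  # index of the top element

    def size(self):
        return self.t + 1

    def is_empty(self):
        return self.t == -1

    def push(self, o):
        if self.t == len(self.S) - 1:
            raise FullStackException()
        self.t += 1
        self.S[self.t] = o

    def pop(self):
        if self.is_empty():
            raise EmptyStackException()
        self.t -= 1
        return self.S[self.t + 1]
```

Every operation only reads or writes S[t] and adjusts t, so each runs in O(1) time.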
Performance and Limitations
Performance
◼ Let n be the number of elements in the stack
◼ The space used is O(n)
◼ Each operation runs in time O(1)
Limitations
◼ The maximum size of the stack must be defined a
priori and cannot be changed
◼ Trying to push a new element into a full stack causes
an implementation-specific exception

Computing Spans
We show how to use a stack as an auxiliary data structure in an algorithm

Given an array X, the span S[i] of X[i] is the maximum number of consecutive elements X[j] immediately preceding X[i] and such that X[j] ≤ X[i] (j ≤ i)

Spans have applications to financial analysis
◼ E.g., stock at 52-week high

X  6 3 4 5 2
S  1 1 2 3 1

[Figure: bar chart of X with the spans marked]
Quadratic Algorithm
Algorithm spans1(X, n)
  Input: array X of n integers
  Output: array S of spans of X          # operations
  S ← new array of n integers            # n
  for i ← 0 to n − 1 do                  # n
    s ← 1                                # n
    while s ≤ i ∧ X[i − s] ≤ X[i] do     # 1 + 2 + … + (n − 1)
      s ← s + 1                          # 1 + 2 + … + (n − 1)
    S[i] ← s                             # n
  return S                               # 1

Algorithm spans1 runs in O(n²) time
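A direct Python translation of spans1, runnable as-is:

```python
def spans1(X):
    """Quadratic span computation: for each i, count how far back
    values stay <= X[i] among the immediately preceding elements."""
    n = len(X)
    S = [0] * n
    for i in range(n):
        s = 1
        while s <= i and X[i - s] <= X[i]:   # inner loop can scan all of 0..i-1
            s += 1
        S[i] = s
    return S
```

On the slide's example X = [6, 3, 4, 5, 2] this produces S = [1, 1, 2, 3, 1].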
Computing Spans with a Stack
We keep in a stack the indices of the elements visible when “looking back”
We scan the array from left to right
◼ Let i be the current index
◼ We pop indices from the stack until we find an index j such that X[i] < X[j]
◼ We set S[i] ← i − j
◼ We push i onto the stack
Linear Algorithm
Each index of the array
◼ is pushed onto the stack exactly once
◼ is popped from the stack at most once
The statements in the while-loop are executed at most n times
Algorithm spans2 runs in O(n) time

Algorithm spans2(X, n)                            # operations
  S ← new array of n integers                     # n
  A ← new empty stack                             # 1
  for i ← 0 to n − 1 do                           # n
    while (¬A.isEmpty() ∧ X[A.top()] ≤ X[i]) do   # n
      A.pop()                                     # n
    if A.isEmpty() then                           # n
      S[i] ← i + 1                                # n
    else
      S[i] ← i − A.top()                          # n
    A.push(i)                                     # n
  return S                                        # 1
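The linear algorithm also translates directly; a Python list serves as the stack A:

```python
def spans2(X):
    """Linear-time spans: A holds indices of the elements still
    'visible when looking back' (a strictly decreasing sequence of values)."""
    n = len(X)
    S = [0] * n
    A = []                                   # stack of indices
    for i in range(n):
        while A and X[A[-1]] <= X[i]:        # pop indices of smaller-or-equal values
            A.pop()
        S[i] = i + 1 if not A else i - A[-1]
        A.append(i)                          # each index is pushed exactly once
    return S
```

Since every index is pushed once and popped at most once, the total work is O(n).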
Growable Array-based Stack
In a push operation, when the array is full, instead of throwing an exception, we can replace the array with a larger one
How large should the new array be?
◼ incremental strategy: increase the size by a constant c
◼ doubling strategy: double the size

Algorithm push(o)
  if t = S.length − 1 then
    A ← new array of size …
    for i ← 0 to t do
      A[i] ← S[i]
    S ← A
  t ← t + 1
  S[t] ← o
Comparison of the Strategies

We compare the incremental strategy and the doubling strategy by analyzing the total time T(n) needed to perform a series of n push operations
We assume that we start with an empty stack represented by an array of size 1
We call the amortized time of a push operation the average time taken by a push over the series of operations, i.e., T(n)/n
Incremental Strategy Analysis

We replace the array k = n/c times
The total time T(n) of a series of n push operations is proportional to
n + c + 2c + 3c + 4c + … + kc =
n + c(1 + 2 + 3 + … + k) =
n + ck(k + 1)/2
Since c is a constant, T(n) is O(n + k²), i.e., O(n²)
The amortized time of a push operation is O(n)
Doubling Strategy Analysis
We replace the array k = log₂ n times
The total time T(n) of a series of n push operations is proportional to
n + 1 + 2 + 4 + 8 + … + 2^k = n + 2^(k+1) − 1 = 3n − 1
T(n) is O(n)
The amortized time of a push operation is O(1)

[Figure: the geometric series 1, 2, 4, 8, …]
Stack Of Cups

[Figure: two stacks of cups labelled A (bottom) through E, then F pushed on top]

• Add a cup to the top of the stack.
• Remove a cup from the top of the stack.
• A stack is a LIFO list.
Towers Of Hanoi/Brahma

[Figure: disks 4, 3, 2, 1 stacked on tower A; towers B and C empty]

• 64 gold disks to be moved from tower A to tower C
• each tower operates as a stack
• cannot place a big disk on top of a smaller one

3-disk Towers Of Hanoi/Brahma
[Animation: seven frames showing the 3-disk puzzle being solved, moving all disks from tower A to tower C]
• 7 disk moves
Recursive Solution
• n > 0 gold disks to be moved from A to C using B
• move top n − 1 disks from A to B using C
• move top disk from A to C
• move top n − 1 disks from B to C using A

• moves(n) = 0 when n = 0
• moves(n) = 2·moves(n − 1) + 1 = 2^n − 1 when n > 0
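The three recursive steps above translate directly into code; a minimal Python sketch (the function name hanoi is ours):

```python
def hanoi(n, src, aux, dst, moves):
    """Move n disks from src to dst using aux, recording each move."""
    if n == 0:
        return
    hanoi(n - 1, src, dst, aux, moves)   # top n-1 disks: src -> aux
    moves.append((src, dst))             # bottom disk:   src -> dst
    hanoi(n - 1, aux, src, dst, moves)   # n-1 disks:     aux -> dst

moves = []
hanoi(3, 'A', 'B', 'C', moves)           # produces the 7 moves of the 3-disk puzzle
```

The recursion makes moves(n) = 2·moves(n − 1) + 1 calls to append, i.e. 2^n − 1 moves.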
Towers Of Hanoi/Brahma

• moves(64) = 1.8 × 10^19 (approximately)
• Performing 10^9 moves/second, a computer would take about 570 years to complete.
• At 1 disk move/min, the monks will take about 3.4 × 10^13 years.
Chess Story

• 1 grain of rice on the first square, 2 for the next, 4 for the next, 8 for the next, and so on.
• The surface area needed exceeds the surface area of the earth.
Chess Story

• 1 penny for the first square, 2 for the next, 4 for the next, 8 for the next, and so on.
• $3.6 × 10^17 (federal budget ~ $2 × 10^12).
Switch Box Routing

[Figure: a square routing region with 40 pins on its boundary — 1–10 across the top, 11–20 down the right side, 21–30 across the bottom, and 31–40 up the left side]

Routing A 2-pin Net
• Routing for pins 1–3 and 18–40 is confined to the lower left region; routing for pins 5 through 16 is confined to the upper right region.
• (u,v), u < v, is a 2-pin net; u is the start pin, v is the end pin.
• Examine pins in clockwise order beginning with pin 1.
• Start pin => push onto stack.
• End pin => start pin must be at top of stack.
Method Invocation And Return
public void a()
{ …; b(); …}
public void b()
{ …; c(); …}
public void c()
{ …; d(); …}
public void d()
{ …; e(); …}
public void e()
{ …; c(); …}

Return-address stack (top to bottom):
return address in d()
return address in c()
return address in e()
return address in d()
return address in c()
return address in b()
return address in a()
Try-Throw-Catch
• When you enter a try block, push the address of
this block on a stack.
• When an exception is thrown, pop the try block
that is at the top of the stack (if the stack is
empty, terminate).
• If the popped try block has no matching catch
block, go back to the preceding step.
• If the popped try block has a matching catch
block, execute the matching catch block.

Rat In A Maze

[Figure sequence: a rat explores a grid maze from entry to exit]

• Move order is: right, down, left, up
• Block visited positions to avoid revisiting them.
• When no forward move is possible, move backward until we reach a square from which a forward move is possible.
• [Animation: the rat moves down, left, down; backtracks; moves down, right; backtracks again; then moves down, right, one down then right, one up then right, and finally down to the exit to eat the cheese.]
• Path from maze entry to current position operates as a stack.
Queues

The Queue ADT
The Queue ADT stores arbitrary objects
Insertions and deletions follow the first-in first-out (FIFO) scheme
Insertions are at the rear of the queue and removals are at the front of the queue

Main queue operations:
◼ enqueue(object): inserts an element at the end of the queue
◼ object dequeue(): removes and returns the element at the front of the queue

Auxiliary queue operations:
◼ object front(): returns the element at the front without removing it
◼ integer size(): returns the number of elements stored
◼ boolean isEmpty(): indicates whether no elements are stored

Exceptions
◼ Attempting the execution of dequeue or front on an empty queue throws an EmptyQueueException
Applications of Queues

Direct applications
◼ Waiting lists, bureaucracy
◼ Access to shared resources (e.g., printer)
◼ Multiprogramming
Indirect applications
◼ Auxiliary data structure for algorithms
◼ Component of other data structures

Array-based Queue
Use an array of size N in a circular fashion
Two variables keep track of the front and rear
  f: index of the front element
  r: index immediately past the rear element
Array location r is kept empty

[Figure: normal configuration — elements occupy indices f through r − 1; wrapped-around configuration — elements occupy f through N − 1 and 0 through r − 1]
Queue Operations
We use the modulo operator (remainder of division)

Algorithm size()
  return (N − f + r) mod N

Algorithm isEmpty()
  return (f = r)
Queue Operations (cont.)
Operation enqueue throws an exception if the array is full
This exception is implementation-dependent

Algorithm enqueue(o)
  if size() = N − 1 then
    throw FullQueueException
  else
    Q[r] ← o
    r ← (r + 1) mod N
Queue Operations (cont.)
Operation dequeue throws an exception if the queue is empty
This exception is specified in the queue ADT

Algorithm dequeue()
  if isEmpty() then
    throw EmptyQueueException
  else
    o ← Q[f]
    f ← (f + 1) mod N
    return o
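Putting the four queue algorithms together gives a complete circular-array queue; a Python sketch mirroring the slides' Q, N, f, and r (class names are ours):

```python
class EmptyQueueException(Exception): pass
class FullQueueException(Exception): pass

class ArrayQueue:
    def __init__(self, N):
        self.Q = [None] * N
        self.N = N
        self.f = 0     # index of the front element
        self.r = 0     # index just past the rear element (kept empty)

    def size(self):
        return (self.N - self.f + self.r) % self.N

    def is_empty(self):
        return self.f == self.r

    def enqueue(self, o):
        if self.size() == self.N - 1:     # one slot is always kept empty
            raise FullQueueException()
        self.Q[self.r] = o
        self.r = (self.r + 1) % self.N    # wrap around the end of the array

    def dequeue(self):
        if self.is_empty():
            raise EmptyQueueException()
        o = self.Q[self.f]
        self.f = (self.f + 1) % self.N
        return o
```

Because location r stays empty, an array of size N holds at most N − 1 elements.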
Growable Array-based Queue
In an enqueue operation, when the array is full,
instead of throwing an exception, we can
replace the array with a larger one
Similar to what we did for an array-based stack
The enqueue operation has amortized running
time
◼ O(n) with the incremental strategy
◼ O(1) with the doubling strategy

Deques

• New items can be added at either the front or the rear.
• Existing items can be removed from either end.
• Hybrid linear structure
• Provides all the capabilities of stacks and queues in a single data structure.

[Figure: a deque, with additions and removals possible at both ends]
Deque Implementation

• Implementation using Python lists
• Our implementation will assume that the rear of the deque is at position 0 in the list.
Deque Implementation

class Deque:
    def __init__(self):
        self.items = []

    def is_empty(self):
        return self.items == []

    def add_front(self, item):
        self.items.append(item)

    def add_rear(self, item):
        self.items.insert(0, item)

    def remove_front(self):
        return self.items.pop()

    def remove_rear(self):
        return self.items.pop(0)

    def size(self):
        return len(self.items)
Deque Application

• Palindrome checker
• A palindrome is a string that reads the same forwards and backwards
• Examples
• MALAYALAM
• RADAR
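The deque makes the palindrome check almost trivial: repeatedly remove one character from each end and compare. A minimal sketch using Python's built-in collections.deque (a plausible stand-in for the Deque class above):

```python
from collections import deque

def is_palindrome(s):
    """Compare characters removed from the two ends of a deque;
    any mismatch means s is not a palindrome."""
    d = deque(s)
    while len(d) > 1:
        if d.popleft() != d.pop():   # front vs. rear
            return False
    return True
```

With collections.deque both end-removals are O(1), so the whole check is O(n).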
HEAPS

Heaps
• A heap can be seen as a complete binary tree:
16

14 10

8 7 9 3

2 4 1

– What makes a binary tree complete?


– Is the example above complete?
Heaps
• A heap can be seen as a complete binary tree:
16

14 10

8 7 9 3

2 4 1 (remaining slots unfilled)

– The book calls them “nearly complete” binary


trees; can think of unfilled slots as null pointers
Heaps
• In practice, heaps are usually implemented as
arrays:
16
14 10
8 7 9 3
2 4 1

A = 16 14 10 8 7 9 3 2 4 1
Heaps
• To represent a complete binary tree as an
array:
– The root node is A[1]
– Node i is A[i]
– The parent of node i is A[i/2] (note: integer divide)
– The left child of node i is A[2i]
– The right child of node i is A[2i + 1]

16
14 10
8 7 9 3
2 4 1

A = 16 14 10 8 7 9 3 2 4 1
Referencing Heap Elements
• So…
Parent(i) { return i/2; }
Left(i) { return 2*i; }
right(i) { return 2*i + 1; }
• An aside: How would you implement this
most efficiently?
• Another aside: Really?
The Heap Property
• Heaps also satisfy the heap property:
A[Parent(i)] ≥ A[i] for all nodes i > 1
– In other words, the value of a node is at most the
value of its parent
– Where is the largest element in a heap stored?
• Definitions:
– The height of a node in the tree = the number of
edges on the longest downward path to a leaf
– The height of a tree = the height of its root
Heap Height
• What is the height of an n-element heap?
Why?
• This is nice: basic heap operations take at
most time proportional to the height of the
heap
Heap Operations: Heapify()
• Heapify(): maintain the heap property
– Given: a node i in the heap with children l and r
– Given: two subtrees rooted at l and r, assumed to
be heaps
– Problem: The subtree rooted at i may violate the
heap property (How?)
– Action: let the value of the parent node “float
down” so subtree at i satisfies the heap property
• What do you suppose will be the basic operation
between i, l, and r?
Heap Operations: Heapify()
Heapify(A, i)
{
    l = Left(i); r = Right(i);
    if (l <= heap_size(A) && A[l] > A[i])
        largest = l;
    else
        largest = i;
    if (r <= heap_size(A) && A[r] > A[largest])
        largest = r;
    if (largest != i) {
        Swap(A, i, largest);
        Heapify(A, largest);
    }
}
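A runnable Python version of Heapify; note the slides index heaps from 1, while this sketch (names are ours) uses Python's 0-based indexing, so the children of i are 2i + 1 and 2i + 2:

```python
def max_heapify(A, i, heap_size):
    """Sift A[i] down until the subtree rooted at i satisfies
    the max-heap property (0-indexed)."""
    l, r = 2 * i + 1, 2 * i + 2
    largest = l if l < heap_size and A[l] > A[i] else i
    if r < heap_size and A[r] > A[largest]:
        largest = r
    if largest != i:
        A[i], A[largest] = A[largest], A[i]   # float the parent's value down
        max_heapify(A, largest, heap_size)
```

Running it on the example from the next slide, A = [16, 4, 10, 14, 7, 9, 3, 2, 8, 1] with i = 1, yields [16, 14, 10, 8, 7, 9, 3, 2, 4, 1].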
Heapify() Example

Start (Heapify is called on the node holding 4, which violates the heap property):

16
4 10
14 7 9 3
2 8 1

A = 16 4 10 14 7 9 3 2 8 1

Swap 4 with its larger child, 14:

16
14 10
4 7 9 3
2 8 1

A = 16 14 10 4 7 9 3 2 8 1

Swap 4 with its larger child, 8 — the heap property is now restored:

16
14 10
8 7 9 3
2 4 1

A = 16 14 10 8 7 9 3 2 4 1
Analyzing Heapify(): Informal
• Aside from the recursive call, what is the
running time of Heapify()?
• How many times can Heapify() recursively
call itself?
• What is the worst-case running time of
Heapify() on a heap of size n?
Analyzing Heapify(): Formal
• Fixing up relationships between i, l, and r takes Θ(1) time
• If the heap at i has n elements, how many elements can the subtrees at l or r have?
– Draw it
• Answer: 2n/3 (worst case: bottom row 1/2 full)
• So time taken by Heapify() is given by
T(n) ≤ T(2n/3) + Θ(1)
2/3 bound
• In the worst case, for a heap with i internal nodes and r leaves, 2i = r + i − 1, so r = i + 1
• Fraction of nodes in the left subtree = (2m + 1)/(3m + 2) ≤ 2/3

[Figure: root with a left subtree of 2m + 1 nodes (bottom row half full) and a right subtree of m nodes]
Analyzing Heapify(): Formal
• So we have
T(n) ≤ T(2n/3) + Θ(1)
• By case 2 of the Master Theorem,
T(n) = O(lg n)
• Thus, Heapify() takes logarithmic time
Heap Operations: BuildHeap()
• We can build a heap in a bottom-up manner
by running Heapify() on successive
subarrays
– Fact: for an array of length n, all elements in range
A[⌊n/2⌋ + 1 .. n] are (one-element) heaps (Why?)
– So:
• Walk backwards through the array from n/2 to 1, calling
Heapify() on each node.
• Order of processing guarantees that the children of
node i are heaps when i is processed
BuildHeap()
// given an unsorted array A, make A a heap
BuildHeap(A)
{
heap_size(A) = length(A);
for (i = length[A]/2 downto 1)
Heapify(A, i);
}
BuildHeap() Example
• Work through example
A = {4, 1, 3, 2, 16, 9, 10, 14, 8, 7}

4
1 3
2 16 9 10
14 8 7
Analyzing BuildHeap()
• Each call to Heapify() takes O(lg n) time
• There are O(n) such calls (specifically, n/2)
• Thus the running time is O(n lg n)
– Is this a correct asymptotic upper bound?
– Is this an asymptotically tight bound?
• A tighter bound is O(n)
– How can this be? Is there a flaw in the above
reasoning?
Analyzing BuildHeap(): Tight
• To Heapify() a subtree takes O(h) time
where h is the height of the subtree
– h = O(lg m), m = # nodes in subtree
– The height of most subtrees is small
• Fact: an n-element heap has at most ⌈n/2^(h+1)⌉
nodes of height h
• Can be used to prove that BuildHeap()
takes O(n) time
Heapsort
• Given BuildHeap(), an in-place sorting
algorithm is easily constructed:
– Maximum element is at A[1]
– Discard by swapping with element at A[n]
• Decrement heap_size[A]
• A[n] now contains correct value
– Restore heap property at A[1] by calling
Heapify()
– Repeat, always swapping A[1] for A[heap_size(A)]
Heapsort
Heapsort(A)
{
BuildHeap(A);
for (i = length(A) downto 2)
{
Swap(A[1], A[i]);
heap_size(A) -= 1;
Heapify(A, 1);
}
}
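The BuildHeap and Heapsort pseudocode above combine into a short runnable sketch; this Python version (0-indexed, function names ours) matches the slides' structure:

```python
def heapsort(A):
    """In-place heapsort: BuildHeap, then repeatedly swap the max (A[0])
    to the end and re-heapify the shrunken heap."""
    def heapify(i, n):
        l, r = 2 * i + 1, 2 * i + 2
        largest = i
        if l < n and A[l] > A[largest]:
            largest = l
        if r < n and A[r] > A[largest]:
            largest = r
        if largest != i:
            A[i], A[largest] = A[largest], A[i]
            heapify(largest, n)

    n = len(A)
    for i in range(n // 2 - 1, -1, -1):   # BuildHeap: O(n)
        heapify(i, n)
    for end in range(n - 1, 0, -1):       # n - 1 extract-max steps, O(lg n) each
        A[0], A[end] = A[end], A[0]       # move current max into place
        heapify(0, end)                   # restore heap on the remaining prefix
    return A
```

The total running time is O(n) + (n − 1)·O(lg n) = O(n lg n), as the analysis slide states.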
Analyzing Heapsort
• The call to BuildHeap() takes O(n) time
• Each of the n - 1 calls to Heapify() takes
O(lg n) time
• Thus the total time taken by HeapSort()
= O(n) + (n - 1) O(lg n)
= O(n) + O(n lg n)
= O(n lg n)
Problems
• 1. Design an algorithm to merge k large sorted
arrays of a total of n elements using no more
than O(k) additional storage. Here k << n.
• 2. Design an algorithm which reads in n
integers one by one and then returns the k
smallest ones. Here k << n and n is unknown
to start with.
Priority Queues
• Heapsort is a nice algorithm, but in practice
Quicksort usually wins
• But the heap data structure is incredibly
useful for implementing priority queues
– A data structure for maintaining a set S of
elements, each with an associated value or key
– Supports the operations Insert(),
Maximum(), and ExtractMax()
– What might a priority queue be useful for?
Priority Queue Operations
• Insert(S, x) inserts the element x into set S
• Maximum(S) returns the element of S with
the maximum key
• ExtractMax(S) removes and returns the
element of S with the maximum key
• How could we implement these operations
using a heap?
Example of Insertion to Max Heap
• Add the new key at the end, then move it upwards while it is larger than its parent

Initial heap:
20
15 2
14 10

insert 5 (5 > 2, so it swaps with 2):
20
15 5
14 10 2

insert 21 (it moves up past 5 and 20 to the root):
21
15 20
14 10 2 5
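The "add at the end, move upwards" rule is a short loop; a 0-indexed Python sketch (the name heap_insert is ours):

```python
def heap_insert(A, key):
    """Append key, then sift it up while it exceeds its parent
    (0-indexed max heap: parent of i is (i - 1) // 2)."""
    A.append(key)
    i = len(A) - 1
    while i > 0 and A[(i - 1) // 2] < A[i]:
        A[i], A[(i - 1) // 2] = A[(i - 1) // 2], A[i]
        i = (i - 1) // 2
```

Each swap moves the key one level up, so insertion costs O(lg n).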
Example of Deletion from Max Heap
• EXTRACT_MAX: remove the root, move the last element to the root, then sift it down

Initial heap:
20
15 2
14 10

Remove 20; the last element 10 moves to the root:
10
15 2
14

Sift 10 down (swap with 15, then with 14):
15
14 2
10
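ExtractMax is the mirror image of insertion; a 0-indexed Python sketch (the name extract_max is ours):

```python
def extract_max(A):
    """Remove and return the root of a max heap; the last element
    moves to the root and is sifted down."""
    top = A[0]
    last = A.pop()
    if A:
        A[0] = last
        i, n = 0, len(A)
        while True:
            l, r = 2 * i + 1, 2 * i + 2
            largest = i
            if l < n and A[l] > A[largest]:
                largest = l
            if r < n and A[r] > A[largest]:
                largest = r
            if largest == i:
                break
            A[i], A[largest] = A[largest], A[i]   # sift down one level
            i = largest
    return top
```

Like insertion, this follows one root-to-leaf path, so it costs O(lg n).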
Binary Search Trees

Binary Trees
A binary tree is
– a root
– a left subtree (maybe empty)
– a right subtree (maybe empty)

[Example tree: A; B, C; D, E, F; G, H; I, J]

Properties
– max # of leaves:
– max # of nodes:
– average depth for N nodes:

Representation: each node stores its data plus left and right child pointers
Binary Tree Representation
[Figure: the tree with root A, children B and C, and D, E, F below, drawn with each node’s data, left-child pointer, and right-child pointer fields; absent children are null]
Dictionary ADT
Dictionary operations
– create
– destroy
– insert
– find
– delete

Example contents (key : value):
Adrien : Roller-blade demon
Donald : l33t haxtor
Hannah : C++ guru
Dave : Older than dirt
…
find(Adrien) → Adrien : Roller-blade demon

Stores values associated with user-specified keys
– values may be any (homogeneous) type
– keys may be any (homogeneous) comparable type
Dictionary ADT:
Used Everywhere
• Arrays
• Sets
• Dictionaries
• Router tables
• Page tables
• Symbol tables
• C++ structures
• …

Anywhere we need to find things fast based on a key

11/08/2022 131
Search ADT
Dictionary operations
– create
– destroy
– insert
– find
– delete

Example contents (keys only): Adrien, Donald, Hannah, Dave, …
find(Adrien) → Adrien

Stores only the keys
– keys may be any (homogeneous) comparable type
– quickly tests for membership

Simplified dictionary, useful for examples (e.g. CSE 326)
Dictionary Data Structure:
Requirements
• Fast insertion
– runtime:

• Fast searching
– runtime:

• Fast deletion
– runtime:
Naïve Implementations
          unsorted array      sorted array         linked list
insert    O(n)                find + O(n)          O(1)
find      O(n)                O(log n)             O(n)
delete    find + O(1)         find + O(1)          find + O(1)
          (mark-as-deleted)   (mark-as-deleted)
Binary Search Tree
Dictionary Data Structure
Binary tree property
– each node has ≤ 2 children
– result:
• storage is small
• operations are simple
• average depth is small

Search tree property
– all keys in left subtree smaller than root’s key
– all keys in right subtree larger than root’s key
– result:
• easy to find any given key
• insert/delete by changing links

[Example tree: 8; 5, 11; 2, 6, 10, 12; 4, 7, 9, 14; 13]
Example and Counter-Example
[Figure: two trees — the left one violates the search tree property, the right one is a valid binary search tree]

NOT A BINARY SEARCH TREE          BINARY SEARCH TREE
Complete Binary Search Tree
Complete binary search tree
(aka binary heap):
– Links are completely filled, 8
except possibly bottom level,
which is filled left-to-right.
5 15

3 7 9 17

1 4 6

11/08/2022 137
In-Order Traversal

visit left subtree
visit node
visit right subtree

[Example tree: 10; 5, 15; 2, 9, 20; 7, 17, 30]

What does this guarantee with a BST?

In-order listing:
2→5→7→9→10→15→17→20→30
Recursive Find
[Example tree: 10; 5, 15; 2, 9, 20; 7, 17, 30]

Node *
find(Comparable key, Node * t)
{
  if (t == NULL) return t;
  else if (key < t->key)
    return find(key, t->left);
  else if (key > t->key)
    return find(key, t->right);
  else
    return t;
}

Runtime:
Best-worst case?
Worst-worst case?
f(depth)?
Iterative Find
[Example tree: 10; 5, 15; 2, 9, 20; 7, 17, 30]

Node *
find(Comparable key, Node * t)
{
  while (t != NULL && t->key != key)
  {
    if (key < t->key)
      t = t->left;
    else
      t = t->right;
  }
  return t;
}
Insert
Concept:
▪ Proceed down tree as in Find
▪ If new key not found, then insert a new node at last spot traversed

void
insert(Comparable x, Node *& t)   // pointer passed by reference so the
{                                  // new node is linked into the tree
  if ( t == NULL ) {
    t = new Node(x);
  } else if (x < t->key) {
    insert( x, t->left );
  } else if (x > t->key) {
    insert( x, t->right );
  } else {
    // duplicate
    // handling is app-dependent
  }
}
BuildTree for BSTs
• Suppose the data 1, 2, 3, 4, 5, 6, 7, 8, 9 is
inserted into an initially empty BST:
– in order

– in reverse order

– median first, then left median, right median, etc.

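The effect of insertion order can be checked directly; a small Python sketch (all names ours) that builds the tree from 1–9 in sorted order versus median-first order and compares the resulting heights:

```python
class BSTNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, key):
    """Standard BST insert; duplicates are ignored."""
    if root is None:
        return BSTNode(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root

def height(root):
    if root is None:
        return 0
    return 1 + max(height(root.left), height(root.right))

def build(keys):
    root = None
    for k in keys:
        root = insert(root, k)
    return root

sorted_order = build(range(1, 10))                 # 1..9 in order: a "stick"
median_first = build([5, 3, 8, 2, 4, 7, 9, 1, 6])  # median, then sub-medians
```

Sorted insertion yields height 9 (a degenerate linked list); median-first insertion yields height 4, close to the ideal ⌈log₂(9 + 1)⌉.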
Analysis of BuildTree
Worst case is O(n²)

1 + 2 + 3 + … + n = O(n²)

Average case assuming all orderings equally likely:


O(n log n)
– averaging over all insert sequences (not over all binary trees)
– equivalently: average depth of a node is log n
– proof: see Introduction to Algorithms, Cormen, Leiserson, & Rivest

11/08/2022 143
BST Bonus:
FindMin, FindMax
• Find minimum
10

5 15
• Find maximum
2 9 20

7 17 30

11/08/2022 144
Successor Node
Next larger node in this node’s subtree

Node * succ(Node * t) {
  if (t->right == NULL)
    return NULL;
  else
    return min(t->right);
}

[Example tree: 10; 5, 15; 2, 9, 20; 7, 17, 30]

How many children can the successor of a node have?
Predecessor Node
Next smaller node in this node’s subtree

Node * pred(Node * t) {
  if (t->left == NULL)
    return NULL;
  else
    return max(t->left);
}

[Example tree: 10; 5, 15; 2, 9, 20; 7, 17, 30]
Deletion
10

5 15

2 9 20

7 17 30

Why might deletion be harder than insertion?

11/08/2022 147
Lazy Deletion
Instead of physically deleting nodes,
just mark them as deleted
+ simpler
+ physical deletions done in batches 10
+ some adds just flip deleted flag
- extra memory for deleted flag 5 15
- many lazy deletions slow finds
- some operations may have to be
modified (e.g., min and max) 2 9 20

7 17 30

11/08/2022 148
Lazy Deletion
Delete(17)
10
Delete(15)

Delete(5) 5 15

Find(9) 2 9 20
Find(16)
7 17 30
Insert(5)

Find(17)

11/08/2022 149
Deletion - Leaf Case

Delete(17) 10

5 15

2 9 20

7 17 30

11/08/2022 150
Deletion - One Child Case

Delete(15) 10

5 15

2 9 20

7 30

11/08/2022 151
Deletion - Two Child Case
Replace node with descendant
whose value is guaranteed to be 10
between left and right subtrees:
the successor 5 20

2 9 30

Delete(5)
7

Could we have used predecessor instead?

11/08/2022 152
Delete Code
void delete(Comparable key, Node *& root) {
  Node *& handle(find(key, root));
  Node * toDelete = handle;
  if (handle != NULL) {
    if (handle->left == NULL) {          // Leaf or one child
      handle = handle->right;
      delete toDelete;
    } else if (handle->right == NULL) {  // One child
      handle = handle->left;
      delete toDelete;
    } else {                             // Two children
      Node * successor = succ(handle);
      handle->data = successor->data;
      delete(successor->data, handle->right);
    }
  }
}

11/08/2022 153
Thinking about
Binary Search Trees
Observations
– Each operation views two new elements at a time
– Elements (even siblings) may be scattered in memory
– Binary search trees are fast if they’re shallow

Realities
– For large data sets, disk accesses dominate runtime
– Some deep and some shallow BSTs exist for any data

11/08/2022 154
Beauty is Only Θ(log n) Deep
Binary Search Trees are fast if they’re shallow:
– perfectly complete
– complete – possibly missing some “fringe” (leaves)
– any other good cases?

What matters?
– Problems occur when one branch is much longer than another
– i.e. when tree is out of balance

11/08/2022 155
Dictionary Implementations
          unsorted array      sorted array         linked list        BST
insert    O(n)                find + O(n)          O(1)               O(Depth)
find      O(n)                O(log n)             O(n)               O(Depth)
delete    find + O(1)         find + O(1)          find + O(1)        O(Depth)
          (mark-as-deleted)   (mark-as-deleted)

BSTs look good for shallow trees, i.e. if Depth is small (log n);
otherwise as bad as a linked list!
11/08/2022 156
Digression: Tail Recursion
• Tail recursion: when the tail (final operation)
of a function recursively calls the function

• Why is tail recursion especially bad with a


linked list?

• Why might it be a lot better with a tree? Why


might it not?

11/08/2022 157
Making Trees Efficient:
Possible Solutions
Keep BSTs shallow by maintaining “balance”
→AVL trees

… also exploit most-recently-used (mru) info


→Splay trees

Reduce disk access by increasing branching


factor
→B-trees
11/08/2022 158
JSON
About Jeff Fox (@jfox015)
• 16-year web development professional

• (Almost) entirely self taught

• Has used various Ajax-esque data technologies


since 2000, including XML, MS data islands and
AMF for Flash

• Develops JavaScript based web apps that rely on


JSON for data workflow
Overview
• What is JSON?

• Comparisons with XML

• Syntax

• Data Types

• Usage

• Live Examples
What is JSON?
JSON is…

• A lightweight text based data-interchange


format

• Completely language independent

• Based on a subset of the JavaScript


Programming Language

• Easy to understand, manipulate and generate


JSON is NOT…

• Overly Complex

• A “document” format

• A markup language

• A programming language
Why use JSON?
• Straightforward syntax

• Easy to create and manipulate

• Can be natively parsed in JavaScript using eval()

• Supported by all major JavaScript frameworks

• Supported by most backend technologies


JSON vs. XML
Much Like XML
• Plain text formats

• “Self-describing” (human readable)

• Hierarchical (Values can contain lists of objects


or values)
Not Like XML
• Lighter and faster than XML

• JSON uses typed objects. All XML values are type-


less strings and must be parsed at runtime.

• Less syntax, no semantics

• Properties are immediately accessible to


JavaScript code
Knocks against JSON

• Lack of namespaces

• No inherent validation (XML has DTD and templates, but there is JSONlint)

• Not extensible

• It’s basically just not XML


Syntax
JSON Object Syntax
• Unordered sets of name/value pairs

• Begins with { (left brace)

• Ends with } (right brace)

• Each name is followed by : (colon)

• Name/value pairs are separated by , (comma)


JSON Example

var employeeData = {
"employee_id": 1234567,
"name": "Jeff Fox",
"hire_date": "1/1/2013",
"location": "Norwalk, CT",
"consultant": false
};
Arrays in JSON
• An ordered collection of values

• Begins with [ (left bracket)

• Ends with ] (right bracket)

• Values are separated by , (comma)
JSON Array Example
var employeeData = {
"employee_id": 1236937,
"name": "Jeff Fox",
"hire_date": "1/1/2013",
"location": "Norwalk, CT",
"consultant": false,
"random_nums": [ 24,65,12,94 ]
};
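JSON's language independence means the same document can be consumed outside JavaScript with a real parser instead of eval(). As an illustration (our own example, not from the slides), Python's standard json module round-trips the employee record above:

```python
import json

text = '''{
  "employee_id": 1236937,
  "name": "Jeff Fox",
  "hire_date": "1/1/2013",
  "location": "Norwalk, CT",
  "consultant": false,
  "random_nums": [24, 65, 12, 94]
}'''

employee = json.loads(text)          # JSON object -> Python dict
total = sum(employee["random_nums"]) # JSON array  -> Python list
back = json.dumps(employee)          # dict -> JSON string again
```

JSON's typed values map directly onto native types: false becomes False, the array becomes a list, and the number stays an int.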
Data Types
Data Types: Strings

• Sequence of 0 or more Unicode characters

• Wrapped in "double quotes“

• Backslash escapement
Data Types: Numbers
• Integer

• Real

• Scientific

• No octal or hex

• No NaN or Infinity – Use null instead.


Data Types: Booleans & Null
• Booleans: true or false

• Null: A value that specifies nothing or no


value.
Data Types: Objects & Arrays

• Objects: Unordered name/value pairs wrapped in { }

• Arrays: Ordered lists of values wrapped in [ ]


JSON Usage
How & When to use JSON

• Transfer data to and from a server

• Perform asynchronous data calls without


requiring a page refresh

• Working with data stores

• Compile and save form or user data for local


storage
Where is JSON used today?
• Anywhere and everywhere!

And many,
many more!
