Sorting and Searching Algorithms
Objectives
To study and analyze the time efficiency of various sorting algorithms.
To design, implement, and analyze bubble sort.
To design, implement, and analyze merge sort.
To design, implement, and analyze quick sort.
Linear Search
One by one...
Linear Search
Check every element in the list until the target is found.
For example, our target is 38:
i      0   1   2   3   4   5
a[i]   25  14  9   38  77  45
Comparing 38 with a[0], a[1], and a[2] fails (not found yet); a[3] equals 38, so the target is found at index 3.
Linear Search
1) Initialize an index variable i.
2) Compare a[i] with the target:
• If a[i] == target, the target is found.
• If a[i] != target:
• If all elements have already been checked, the target is not found.
• Otherwise, move i to the next index and go to step 2.
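A minimal sketch of these steps in Java (the class and method names are illustrative), using the example array above:

    public class LinearSearch {
        // Returns the index of target in a, or -1 if the target is not found.
        public static int linearSearch(int[] a, int target) {
            for (int i = 0; i < a.length; i++) {
                if (a[i] == target) {
                    return i; // found
                }
            }
            return -1; // every element has been checked: not found
        }

        public static void main(String[] args) {
            int[] a = {25, 14, 9, 38, 77, 45};
            System.out.println(linearSearch(a, 38)); // prints 3
        }
    }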
Linear Search
Time complexity in the worst case?
– If N is the number of elements, the time complexity is O(N).
Advantage? It is simple and does not require the list to be sorted.
Disadvantage? It is slow: in the worst case every element must be checked.
Binary Search
Chop by half...
Binary Search
Given a SORTED list:
(Again, our target is 38)
i      0   1   2   3   4   5
a[i]   9   14  25  38  45  77
       L                   R
First pass: the middle element a[2] = 25 is smaller than 38, so L moves to index 3. Second pass: the middle element a[4] = 45 is larger than 38, so R moves to index 3. Third pass: a[3] = 38 is found.
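A minimal sketch of binary search in Java (names are illustrative); low and high play the roles of L and R in the figure:

    public class BinarySearch {
        // Returns the index of target in the sorted array a, or -1 if not found.
        public static int binarySearch(int[] a, int target) {
            int low = 0;                // L
            int high = a.length - 1;    // R
            while (low <= high) {
                int mid = (low + high) / 2;
                if (a[mid] == target) {
                    return mid;         // found
                } else if (a[mid] < target) {
                    low = mid + 1;      // target must be in the right half
                } else {
                    high = mid - 1;     // target must be in the left half
                }
            }
            return -1;                  // not found
        }

        public static void main(String[] args) {
            int[] a = {9, 14, 25, 38, 45, 77};
            System.out.println(binarySearch(a, 38)); // prints 3
        }
    }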
Binary Search
Why always split in the middle, and not at some other position, say one third of the way through the list?
(Figure: the array after the 1st, 2nd, 3rd, 4th, and 5th sorting passes.)
The total number of comparisons over all passes is
(n-1) + (n-2) + ... + 2 + 1 = n^2/2 - n/2 = O(n^2)
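The objectives list bubble sort, and the comparison count above matches its worst case; a minimal sketch, assuming that is the algorithm shown here (names are illustrative):

    public class BubbleSort {
        // Pass k bubbles the largest remaining element to position a.length - k,
        // making (n - k) comparisons, so the passes make
        // (n-1) + (n-2) + ... + 1 comparisons in total.
        public static void bubbleSort(int[] a) {
            for (int k = 1; k < a.length; k++) {
                for (int i = 0; i < a.length - k; i++) {
                    if (a[i] > a[i + 1]) {
                        int temp = a[i];
                        a[i] = a[i + 1];
                        a[i + 1] = temp;
                    }
                }
            }
        }
    }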
Merge Sort
Divide (split):
2 9 5 4 8 1 6 7
2 9 5 4 | 8 1 6 7
2 9 | 5 4 | 8 1 | 6 7
2 | 9 | 5 | 4 | 8 | 1 | 6 | 7
Conquer (merge):
2 9 | 4 5 | 1 8 | 6 7
2 4 5 9 | 1 6 7 8
1 2 4 5 6 7 8 9
Merge Two Sorted Lists
(Figure: merging the two sorted lists 2 4 5 9 and 1 6 7 8. Repeatedly compare the current element of each list and move the smaller one to the result; when one list is exhausted, copy the rest of the other. The result is 1 2 4 5 6 7 8 9.)
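A minimal sketch of merge sort in Java covering both steps above: recursively split the array into halves, then merge the two sorted halves (names are illustrative):

    import java.util.Arrays;

    public class MergeSortSketch {
        public static void mergeSort(int[] a) {
            if (a.length > 1) {
                int[] left = Arrays.copyOfRange(a, 0, a.length / 2);
                int[] right = Arrays.copyOfRange(a, a.length / 2, a.length);
                mergeSort(left);          // divide: sort the left half
                mergeSort(right);         // divide: sort the right half
                merge(left, right, a);    // conquer: merge the sorted halves into a
            }
        }

        // Merge two sorted arrays into result.
        private static void merge(int[] left, int[] right, int[] result) {
            int i = 0, j = 0, k = 0;
            while (i < left.length && j < right.length) {
                result[k++] = (left[i] <= right[j]) ? left[i++] : right[j++];
            }
            while (i < left.length) result[k++] = left[i++];    // copy the rest of left
            while (j < right.length) result[k++] = right[j++];  // copy the rest of right
        }

        public static void main(String[] args) {
            int[] a = {2, 9, 5, 4, 8, 1, 6, 7};
            mergeSort(a);
            System.out.println(Arrays.toString(a)); // [1, 2, 4, 5, 6, 7, 8, 9]
        }
    }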
Quick Sort
(Figure: partitioning around a pivot. The pivot 5 is chosen from the list 5 2 9 3 8 4 0 1 6 7; index pointers scan inward from the low and high ends, swapping elements so that everything smaller than the pivot ends up to its left and everything larger to its right. (g) After partitioning, the array is 4 2 1 3 0 5 8 9 6 7 and the pivot 5 is in its final place. The partial array 4 2 1 3 0 is then partitioned in the same way, e.g. into 0 2 1 3 4 around the pivot 4.)
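A minimal sketch of quick sort in Java; the partition scheme here (first element as pivot, a single left-to-right scan) is one common variant and may differ in detail from the version shown in the figure:

    public class QuickSort {
        public static void quickSort(int[] a) {
            quickSort(a, 0, a.length - 1);
        }

        private static void quickSort(int[] a, int first, int last) {
            if (first < last) {
                int pivotIndex = partition(a, first, last);
                quickSort(a, first, pivotIndex - 1);   // sort the part left of the pivot
                quickSort(a, pivotIndex + 1, last);    // sort the part right of the pivot
            }
        }

        // Partition a[first..last] around a[first]; return the pivot's final index.
        private static int partition(int[] a, int first, int last) {
            int pivot = a[first];
            int boundary = first;            // last index of the "smaller than pivot" region
            for (int i = first + 1; i <= last; i++) {
                if (a[i] < pivot) {
                    boundary++;
                    int temp = a[i]; a[i] = a[boundary]; a[boundary] = temp;
                }
            }
            // Place the pivot between the smaller and larger regions
            int temp = a[first]; a[first] = a[boundary]; a[boundary] = temp;
            return boundary;
        }

        public static void main(String[] args) {
            int[] a = {5, 2, 9, 3, 8, 4, 0, 1, 6, 7};
            quickSort(a);
            System.out.println(java.util.Arrays.toString(a));
        }
    }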
Worst-Case Time
In the worst case, each partition leaves one part empty and the other with all the remaining elements, so the total time is
(n-1) + (n-2) + ... + 2 + 1 = O(n^2)
Best-Case Time
In the best case, each time the pivot divides the
array into two parts of about the same size. Let
T(n) denote the time required for sorting an array of n elements using quick sort. So,

T(n) = T(n/2) + T(n/2) + n = O(n log n)
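Expanding the recurrence shows where the bound comes from (a short worked derivation, with k the depth of the recursion):

T(n) = 2 T(n/2) + n
     = 4 T(n/4) + 2n
     = ...
     = 2^k T(n/2^k) + k*n

With k = log n (so n/2^k = 1), this gives T(n) = n*T(1) + n log n = O(n log n).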
Average-Case Time
On average, the pivot does not divide the array into two parts of the same size, nor does it leave one part empty. Statistically, the sizes of the two parts are very close, so the average time is O(n log n). The exact average-case analysis is beyond the scope of this book.
Heap
A heap is a useful data structure for designing efficient sorting algorithms and priority queues. A heap is a binary tree with the following properties:
• It is a complete binary tree.
• Each node is greater than or equal to any of its children.
(Figure: example binary trees; some satisfy the heap properties and some do not.)
Representing a Heap
For a node at position i, its left child is at position 2i+1, its right child is at position 2i+2, and its parent is at position (i-1)/2.
For example, the node for element 39 is at position 4, so its
left child (element 14) is at 9 (2*4+1), its right child
(element 33) is at 10 (2*4+2), and its parent (element 42) is
at 1 ((4-1)/2).
[0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13]
 62  42  59  32  39  44  13  22  29  14  33  30  17   9
(Figure: the same heap drawn as a complete binary tree: 62 at the root with children 42 and 59; 42 with children 32 and 39; 59 with children 44 and 13; and so on, matching the array positions above.)
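A tiny Java sketch of this index arithmetic using the array above (class name is illustrative):

    public class HeapIndexDemo {
        public static void main(String[] args) {
            int[] heap = {62, 42, 59, 32, 39, 44, 13, 22, 29, 14, 33, 30, 17, 9};
            int i = 4;                               // the node containing 39
            System.out.println(heap[2 * i + 1]);     // left child at position 9: prints 14
            System.out.println(heap[2 * i + 2]);     // right child at position 10: prints 33
            System.out.println(heap[(i - 1) / 2]);   // parent at position 1: prints 42
        }
    }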
Adding Elements to the Heap
(Figure: the heap after adding 3, 5, 1, 19, 11, and 22 in turn; each new element is placed at the next open position of the complete tree.)
Rebuild the heap after adding a new node
(Figure: (a) 88 is added as a new leaf under 19; (b) 88 is swapped with its parent 19; (c) 88 is swapped with its parent 22 and becomes the new root.)
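A minimal sketch of this rebuild step for a max-heap stored in an ArrayList (names are illustrative, not the book's Heap class): append the new element, then repeatedly swap it with its parent while it is larger.

    import java.util.ArrayList;

    public class SiftUpDemo {
        public static void add(ArrayList<Integer> list, int newValue) {
            list.add(newValue);                  // place at the next open position
            int i = list.size() - 1;
            while (i > 0) {
                int parent = (i - 1) / 2;
                if (list.get(i) > list.get(parent)) {
                    int temp = list.get(i);      // swap with the parent
                    list.set(i, list.get(parent));
                    list.set(parent, temp);
                    i = parent;                  // continue from the parent's position
                } else {
                    break;                       // heap property restored
                }
            }
        }

        public static void main(String[] args) {
            ArrayList<Integer> heap = new ArrayList<>();
            for (int v : new int[]{3, 5, 1, 19, 11, 22, 88}) {
                add(heap, v);
            }
            System.out.println(heap.get(0));     // the root is the largest element: 88
        }
    }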
Removing the Root and Rebuild the Tree
Removing root 62 from the heap (level order): 62 | 42 59 | 32 39 44 13 | 22 29 14 33 30 17 9
Move 9, the last element, to the root: 9 | 42 59 | 32 39 44 13 | 22 29 14 33 30 17
Swap 9 with its larger child 59: 59 | 42 9 | 32 39 44 13 | 22 29 14 33 30 17
Swap 9 with its larger child 44: 59 | 42 44 | 32 39 9 13 | 22 29 14 33 30 17
Swap 9 with its larger child 30: 59 | 42 44 | 32 39 30 13 | 22 29 14 33 9 17
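A minimal sketch of this removal for a max-heap in an ArrayList (names are illustrative): move the last element to the root, then swap it down with its larger child until the heap property holds again.

    import java.util.ArrayList;

    public class RemoveRootDemo {
        public static int remove(ArrayList<Integer> list) {
            int root = list.get(0);
            list.set(0, list.get(list.size() - 1));   // move the last element to the root
            list.remove(list.size() - 1);

            int i = 0;
            while (2 * i + 1 < list.size()) {         // while the node has a left child
                int left = 2 * i + 1;
                int right = 2 * i + 2;
                int larger = left;                    // index of the larger child
                if (right < list.size() && list.get(right) > list.get(left)) {
                    larger = right;
                }
                if (list.get(i) < list.get(larger)) { // swap down with the larger child
                    int temp = list.get(i);
                    list.set(i, list.get(larger));
                    list.set(larger, temp);
                    i = larger;
                } else {
                    break;                            // heap property restored
                }
            }
            return root;
        }

        public static void main(String[] args) {
            ArrayList<Integer> heap = new ArrayList<>(java.util.Arrays.asList(
                62, 42, 59, 32, 39, 44, 13, 22, 29, 14, 33, 30, 17, 9));
            System.out.println(remove(heap)); // prints 62
            System.out.println(heap);         // [59, 42, 44, 32, 39, 30, 13, 22, 29, 14, 33, 9, 17]
        }
    }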
The Heap Class
Heap<E>
-list: java.util.ArrayList<E>
+Heap() Creates a default empty heap.
+Heap(objects: E[]) Creates a heap with the specified objects.
+add(newObject: E): void Adds a new object to the heap.
+remove(): E Removes the root from the heap and returns it.
+getSize(): int Returns the size of the heap.
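A hedged sketch of heap sort using the interface listed above (it assumes the Heap<E> class is available as specified; the Comparable bound on E is an assumption): add every element to a heap, then repeatedly remove the root, which is the largest remaining element, and store it from the back of the array toward the front.

    public class HeapSort {
        public static <E extends Comparable<E>> void heapSort(E[] list) {
            Heap<E> heap = new Heap<>();      // assumes the Heap<E> class above
            for (E e : list) {
                heap.add(e);
            }
            for (int i = list.length - 1; i >= 0; i--) {
                list[i] = heap.remove();      // largest remaining element
            }
        }
    }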
Heap Sort Time
Let h denote the height of a heap of n elements. Since a heap is a complete binary tree, the first level has 1 node, the second level has 2 nodes, the kth level has 2^(k-1) nodes, the (h-1)th level has 2^(h-2) nodes, and the hth level has at least one node and at most 2^(h-1) nodes. Therefore,

1 + 2 + ... + 2^(h-2) < n <= 1 + 2 + ... + 2^(h-2) + 2^(h-1)

that is,

2^(h-1) - 1 < n <= 2^h - 1
2^(h-1) < n + 1 <= 2^h
h - 1 < log(n + 1) <= h

So the height h is O(log n); since each removal rebuilds the heap along one root-to-leaf path, the n removals in heap sort take O(n log n) time in total.
External Sort
Phase I
(Figure: the original file is read into an array one chunk at a time; each chunk is sorted in memory and written out as a sorted segment, producing segments S1, S2, ..., Sk in a temporary file.)
Phase II
Merge a pair of sorted segments (e.g., S1 with S2,
S3 with S4, ..., and so on) into a larger sorted
segment and save the new segment into a new
temporary file. Continue the same process until
one sorted segment results.
S1 S2 S3 S4 S5 S6 S7 S8 ... Sk
merge: (S1, S2 merged) (S3, S4 merged) (S5, S6 merged) (S7, S8 merged) ...
merge: (S1, S2, S3, S4 merged) (S5, S6, S7, S8 merged) ...
merge: (S1, S2, S3, S4, S5, S6, S7, S8 merged) ...
Implementing Phase II
Each merge step merges two sorted segments into a new segment. The new segment has twice as many elements, so the number of segments is halved after each merge step. A segment is too large to be brought into an array in memory. To implement a merge step, copy half of the segments from file f1.dat to a temporary file f2.dat. Then merge the first remaining segment in f1.dat with the first segment in f2.dat into a temporary file named f3.dat.
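A hedged sketch of merging one pair of sorted segments, assuming each segment holds segmentSize ints stored in binary form and that the two input streams are positioned at the start of their segments; the stream-based structure and names are illustrative, not the book's exact implementation:

    import java.io.*;

    public class MergeSegments {
        // Merge one sorted segment from f1 and one from f2 into f3.
        public static void mergeOneSegment(DataInputStream f1, DataInputStream f2,
                DataOutputStream f3, int segmentSize) throws IOException {
            int read1 = 0, read2 = 0;            // elements consumed from each segment
            int v1 = f1.readInt(); read1++;
            int v2 = f2.readInt(); read2++;
            while (true) {
                if (v1 <= v2) {
                    f3.writeInt(v1);
                    if (read1 < segmentSize) {
                        v1 = f1.readInt(); read1++;
                    } else {                     // segment 1 exhausted: copy the rest of segment 2
                        f3.writeInt(v2);
                        while (read2 < segmentSize) { f3.writeInt(f2.readInt()); read2++; }
                        break;
                    }
                } else {
                    f3.writeInt(v2);
                    if (read2 < segmentSize) {
                        v2 = f2.readInt(); read2++;
                    } else {                     // segment 2 exhausted: copy the rest of segment 1
                        f3.writeInt(v1);
                        while (read1 < segmentSize) { f3.writeInt(f1.readInt()); read1++; }
                        break;
                    }
                }
            }
        }
    }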
Implementing Phase II
f1.dat: S1 S2 S3 S4 S5 S6 S7 S8 ... Sk
Copy the first half of the segments to f2.dat:
f2.dat: S1 S2 S3 S4 ...