Search in C
Binary search: for large, sorted arrays
Sequential search: start at the first element of the array. Compare it with the value (key) you are searching for. Continue with the next element of the array until you find a match or reach the last element.
Note: on average, when the key is present, you will have to compare it with about half the elements in the array.
[Figure: array 21 13 8 5 3 2, positions labeled 6 5 4 3 2 1, with the target marked]
Otherwise, if no match is found, return -1.
In the best case, the target is the first entry, and the search takes only 1 step.
In the worst case, the target is the last entry, or it is not in the array at all.
Sequential search then has to scan all n records, which takes n steps in total.
Cost: O(n)
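The sequential search described above can be sketched in C as follows. This is a minimal illustration, not code from the slides: the function name linear_search, the int element type, and the convention of returning -1 when the key is absent are assumptions.

```c
/* Sequential (linear) search, a minimal sketch.
   Assumed convention: returns the index of the first element equal to
   key, or -1 if key does not occur in a[0..n-1]. */
int linear_search(const int a[], int n, int key)
{
    for (int i = 0; i < n; i++) {
        if (a[i] == key) {
            return i;       /* match found at position i */
        }
    }
    return -1;              /* reached the end without a match */
}
```

For the array 21 13 8 5 3 2, searching for 21 stops after one comparison (best case), while searching for 2, or for a value that is not stored, inspects all six elements (worst case).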
Binary Search: if the records are sorted by key, we can do better.
Think about how you look up a word in an English dictionary, where the words are sorted alphabetically.
If the target is present in the list, it must lie between positions low and high.
[Figure: sorted keys A B C ... M ... X Y Z, with low=0, middle=12, high=25]
If the target equals the key in the middle position, we have found the record. Return the value of middle.
[Figure: sorted keys A B C ... L M ... X Y Z, with low=0, high=11]
If target < the key in the middle position, we know that it cannot be in the right half, so we only need to search the left half: high = middle - 1;
[Figure: sorted keys A B C ... M N ... X Y Z, with low=13, high=25]
If target > the key in the middle position, we know that it cannot be in the left half, so we only need to search the right half: low = middle + 1;
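Putting the three cases together gives a loop like the one below. This is a sketch under stated assumptions: the array is sorted in ascending order, the elements are ints, the function name binary_search is illustrative, and -1 is the assumed "not found" result. The variables low, high and middle follow the slides.

```c
/* Iterative binary search, a minimal sketch.
   Assumes a[0..n-1] is sorted in ascending order.
   Returns the index of target, or -1 if it is not present. */
int binary_search(const int a[], int n, int target)
{
    int low = 0;
    int high = n - 1;

    while (low <= high) {
        /* Same value as (low + high) / 2, but cannot overflow int. */
        int middle = low + (high - low) / 2;

        if (a[middle] == target) {
            return middle;          /* found the record */
        } else if (target < a[middle]) {
            high = middle - 1;      /* discard the right half */
        } else {
            low = middle + 1;       /* discard the left half */
        }
    }
    return -1;                      /* range is empty: not found */
}
```

With 26 sorted keys, the first probe uses low = 0 and high = 25, so middle = 12, matching the alphabet example. The loop ends either with a match or when low passes high, which means the target is not in the array.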
Binary Search: start the search in the middle of a sorted array.
A constant amount of work lets us discard half of the remaining data.
Keep halving the search range until the target is found or the range is empty. The array must be sorted beforehand.
Consider a Dictionary
Looking up a word in a dictionary: in what sense is it like a binary search? In what sense is it not? What is the complexity?
Complexity of Binary Search: suppose there are n records. In the best case, the target is the middle entry, and the search takes only 1 step. In the worst case, it takes floor(log2 n) + 1 steps. Cost: O(log n)
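To see these step counts concretely, the sketch below counts how many middle positions are probed. The function name binary_search_counted and the steps output parameter are hypothetical names introduced only for this illustration. As before, the array is assumed to be sorted in ascending order.

```c
/* Binary search that also reports the number of probes, a sketch.
   Assumes a[0..n-1] is sorted in ascending order.
   Stores the number of middle positions examined in *steps and
   returns the index of target, or -1 if it is not present. */
int binary_search_counted(const int a[], int n, int target, int *steps)
{
    int low = 0;
    int high = n - 1;
    *steps = 0;

    while (low <= high) {
        int middle = low + (high - low) / 2;
        (*steps)++;                 /* one probe of the middle entry */

        if (a[middle] == target) {
            return middle;
        } else if (target < a[middle]) {
            high = middle - 1;
        } else {
            low = middle + 1;
        }
    }
    return -1;
}
```

For n = 26 keys holding the values 0 to 25, searching for the middle entry 12 takes 1 step, while searching for 25, or for the absent value 26, takes 5 steps, which equals floor(log2 26) + 1 = 4 + 1.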
Binary search has O(log n) complexity for n stored items. It is frequently used for reasonably sized data sets, e.g. more than 10-20 stored items.