8.0 Searching

Uploaded by

Sana Khan

Data Structure and Algorithm (IT-209)
Instructor: Abdullah Javed
([email protected])
Lecturer,
Govt. Postgraduate College, Jhelum

Lecture 8.0
Fall 2020
Agenda
• Common Problems in CS
• Searching and Sorting
• Linear Search
• Binary Search
• Interpolation Search
Common Problems in Comp. Sci.
• There are some very common problems that we use
computers to solve:
  ◦ Searching through a lot of records for a specific record or set
    of records
  ◦ Placing records in order, which we call sorting

• There are numerous algorithms to perform searches
and sorts. We will briefly explore a few common
ones.

Searching
• Searching data involves determining whether a value (referred to as the search
key) is present in the data and, if so, finding the value’s location.
• A question you should always ask when selecting a search
algorithm is “How fast does the search have to be?” The
reason is that, in general, the faster the algorithm is, the more
complex it is.
• Bottom line: you do not always need, and should not always use, the
fastest algorithm.
• Two popular search algorithms are the simple linear search and the faster but
more complex binary search.

Searching Algorithms
• Looking up a phone number, accessing a website and checking a
word’s definition in a dictionary all involve searching through large
amounts of data.
• Searching algorithms all accomplish the same goal—finding an
element that matches a given search key, if such an element does, in
fact, exist.
• The major difference is the amount of effort they require to complete
the search.
• One way to describe this effort is with Big O notation.
  ◦ For searching and sorting algorithms, this is particularly dependent on the
    number of data elements.

Searching
• Given a list of data, find the location of a particular value or report that
the value is not present
• Linear search
  ◦ intuitive approach
  ◦ start at the first item
  ◦ is it the one I am looking for?
  ◦ if not, go to the next item
  ◦ repeat until found or all items checked

• If the items are not sorted, or cannot be sorted, this approach is necessary

Searching
/* pre: list != null
   post: return the index of the first occurrence of target in list,
         or -1 if target is not present in list
*/
public int linearSearch(int[] list, int target) {
    // Check each element in order and stop at the first match.
    for (int i = 0; i < list.length; i++) {
        if (list[i] == target) {
            return i;
        }
    }
    return -1; // every element was checked without a match
}

Efficiency of Linear Search
Big O: Constant Runtime
• Suppose an algorithm simply tests whether the first element of an array is equal to the
second element.
• If the array has 10 elements, this algorithm requires only one comparison.
• If the array has 1000 elements, the algorithm still requires only one comparison.
• In fact, the algorithm is independent of the number of array elements.
• This algorithm is said to have a constant runtime, which is represented in Big O notation as
O(1).
• An algorithm that is O(1) does not necessarily require only one comparison.
• O(1) just means that the number of comparisons is constant—it does not grow as the size of
the array increases.
• An algorithm that tests whether the first element of an array is equal to any of the next
three elements will always require three comparisons, but in Big O notation it’s still
considered O(1).
• O(1) is often pronounced “on the order of 1” or more simply “order 1.”
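As a minimal sketch of the two constant-time checks described above, the Java below does the same amount of work for any array length. The class and method names are illustrative assumptions and are not part of the lecture.

```java
// Hypothetical helper class; names are chosen for illustration only.
class ConstantTimeChecks {
    // Exactly one comparison, whatever the array length (assumes length >= 2).
    static boolean firstEqualsSecond(int[] a) {
        return a[0] == a[1];
    }

    // At most three comparisons, whatever the array length (assumes length >= 4).
    static boolean firstEqualsAnyOfNextThree(int[] a) {
        return a[0] == a[1] || a[0] == a[2] || a[0] == a[3];
    }

    public static void main(String[] args) {
        int[] small = {7, 7, 1, 2};
        int[] large = new int[1000];   // all zeros
        large[0] = 5;                  // first element differs from the rest
        System.out.println(firstEqualsSecond(small));          // true
        System.out.println(firstEqualsAnyOfNextThree(large));  // false
    }
}
```

Neither method contains a loop over the array, which is why the cost does not depend on n.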

Efficiency of Linear Search (cont.)
Big O: Linear Runtime
• An algorithm that tests whether the first element of an array is equal
to any of the other elements of the array requires at most n – 1
comparisons, where n is the number of elements in the array.
• If the array has 10 elements, the algorithm requires up to nine
comparisons.
• If the array has 1000 elements, the algorithm requires up to 999
comparisons.
• As n grows larger, the n part of the expression n – 1 “dominates,” and
subtracting one becomes inconsequential.
• Big O is designed to highlight these dominant terms and ignore terms
that become unimportant as n grows.

Efficiency of Linear Search (cont.)
• An algorithm that requires a total of n – 1 comparisons is said to be
O(n) and is referred to as having a linear runtime.
• O(n) is often pronounced “on the order of n” or more simply “order n.”
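The n – 1 bound can be checked with a small counting sketch. The class and method names below are hypothetical; the method counts the comparisons made while testing whether the first element equals any other element.

```java
// Hypothetical helper class; names are chosen for illustration only.
class LinearComparisonCount {
    // Returns how many comparisons are made while testing whether a[0]
    // equals any other element; stops at the first match.
    static int comparisonsToMatchFirst(int[] a) {
        int comparisons = 0;
        for (int i = 1; i < a.length; i++) {
            comparisons++;
            if (a[i] == a[0]) {
                break;
            }
        }
        return comparisons;
    }

    public static void main(String[] args) {
        for (int n : new int[]{10, 1000}) {
            int[] a = new int[n];
            for (int i = 0; i < n; i++) {
                a[i] = i;              // all distinct: the worst case
            }
            System.out.println(n + " elements -> "
                    + comparisonsToMatchFirst(a) + " comparisons");
        }
        // prints: 10 elements -> 9 comparisons
        //         1000 elements -> 999 comparisons
    }
}
```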

Efficiency of Linear Search (cont.)
Big O: Quadratic Runtime
• Now suppose you have an algorithm that tests whether any element of an
array is duplicated elsewhere in the array.
• The first element must be compared with every other element in the array.
• The second element must be compared with every other element except the
first (it was already compared to the first).
• The third element then must be compared with every other element except
the first two.
• In the end, this algorithm will end up making (n – 1) + (n – 2) + … + 2 + 1, or
n^2/2 – n/2, comparisons.
• As n increases, the n^2 term dominates and the n term becomes
inconsequential.
• Again, Big O notation highlights the n^2 term, leaving n^2/2.
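A rough sketch of the duplicate test described above, with a separate counter that confirms the n^2/2 – n/2 figure when no duplicate is found. The class and method names are assumptions made for illustration.

```java
// Hypothetical helper class; names are chosen for illustration only.
class DuplicateCheck {
    // Compares each element with every later element; true if any pair matches.
    static boolean hasDuplicate(int[] a) {
        for (int i = 0; i < a.length; i++) {
            for (int j = i + 1; j < a.length; j++) {
                if (a[i] == a[j]) {
                    return true;
                }
            }
        }
        return false;
    }

    // Number of comparisons the loops above make on n distinct elements.
    static long comparisonsWithoutEarlyExit(int n) {
        long count = 0;
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(hasDuplicate(new int[]{1, 2, 3, 2}));  // true
        System.out.println(comparisonsWithoutEarlyExit(4));       // 6
        System.out.println(comparisonsWithoutEarlyExit(8));       // 28
        System.out.println(comparisonsWithoutEarlyExit(1000));    // 499500
    }
}
```

For n = 1000 the count is 1000^2/2 – 1000/2 = 499,500, already close to n^2/2 = 500,000.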
Efficiency of Linear Search (cont.)
• Big O is concerned with how an algorithm’s runtime grows in relation to the number
of items processed.
• Suppose an algorithm requires n^2 comparisons.
• With four elements, the algorithm will require 16 comparisons; with eight elements,
64 comparisons.
• With this algorithm, doubling the number of elements quadruples the number of
comparisons.
• Consider a similar algorithm requiring n^2/2 comparisons.
• With four elements, the algorithm will require eight comparisons; with eight
elements, 32 comparisons.
• Again, doubling the number of elements quadruples the number of comparisons.
• Both of these algorithms grow as the square of n, so Big O ignores the constant, and
both algorithms are considered to be O(n^2), which is referred to as quadratic runtime
and pronounced “on the order of n-squared” or more simply “order n-squared.”

Efficiency of Linear Search (cont.)
• O(n^2) Performance
• When n is small, O(n^2) algorithms will not noticeably affect
performance.
• As n grows, you’ll start to notice the performance degradation.
• An O(n^2) algorithm running on a million-element array would require a
trillion “operations” (where each could actually require several
machine instructions to execute).
  ◦ This could require hours to execute.

• A billion-element array would require a quintillion operations, a
number so large that the algorithm could take decades!
• Unfortunately, O(n^2) algorithms tend to be easy to write.

Efficiency of Linear Search (cont.)
Linear Search’s Runtime
• The linear search algorithm runs in O(n) time.
• The worst case in this algorithm is that every element must be checked to
determine whether the search key is in the array.
• If the array’s size doubles, the number of comparisons that the algorithm
must perform also doubles.
• Linear search can provide outstanding performance if the element matching
the search key happens to be at or near the front of the array.
• But we seek algorithms that perform well, on average, across all searches,
including those where the element matching the search key is near the end
of the array.
• If a program needs to perform many searches on large arrays, it may be
better to implement a different, more efficient algorithm, such as the binary
search which we present in the next section.
Searching in a Sorted List
• If the items are sorted, we can divide and conquer
• Dividing the work in half with each step
  ◦ generally a good thing

• Binary search on a list in ascending order
  ◦ Start at the middle of the list
  ◦ Is that the item?
  ◦ If not, is the middle item less than or greater than the target?
  ◦ If less than, move to the second half of the list
  ◦ If greater than, move to the first half of the list
  ◦ Repeat until found or the sub-list size = 0

Binary Search
[Figure: a list with its low item, middle item and high item marked]

Is the middle item what we are looking for? If not, is it
more or less than the target item? (Assume lower.)

[Figure: the list again, with the low, middle and high items repositioned on the remaining half]
and so forth…
Binary Search in Action
index:  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
value:  2  3  5  7 11 13 17 19 23 29 31 37 41 43 47 53
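public static int bsearch(int[] list, int target) {
    int result = -1;
    int low = 0;
    int high = list.length - 1;
    // Keep halving the range [low, high] until found or the range is empty.
    while (result == -1 && low <= high) {
        int mid = low + ((high - low) / 2); // avoids overflow of (low + high)
        if (list[mid] == target)
            result = mid;
        else if (list[mid] < target)
            low = mid + 1;   // target can only be in the upper half
        else
            high = mid - 1;  // target can only be in the lower half
    }
    return result;
}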
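To see the search in action on the 16-element array shown on this slide, the sketch below uses the same loop but records every index it probes. The class name, the `probes` parameter and the chosen target of 11 are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical tracing variant of the iterative binary search.
class BinarySearchTrace {
    // Same halving loop as bsearch, but appends each probed index to probes.
    static int bsearch(int[] list, int target, List<Integer> probes) {
        int low = 0;
        int high = list.length - 1;
        while (low <= high) {
            int mid = low + ((high - low) / 2);
            probes.add(mid);
            if (list[mid] == target) {
                return mid;
            } else if (list[mid] < target) {
                low = mid + 1;
            } else {
                high = mid - 1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] primes = {2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53};
        List<Integer> probes = new ArrayList<>();
        int index = bsearch(primes, 11, probes);
        System.out.println("found at index " + index + ", probed " + probes);
        // prints: found at index 4, probed [7, 3, 5, 4]
    }
}
```

Four probes are enough here, compared with five comparisons for a linear search of the same target.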
Recursive Binary Search
int bsearch(int[] list, int target, int first, int last) {
    if (first <= last) {
        int mid = first + ((last - first) / 2);
        if (list[mid] == target)
            return mid;
        else if (list[mid] > target)
            return bsearch(list, target, first, mid - 1); // search lower half
        else
            return bsearch(list, target, mid + 1, last);  // search upper half
    }
    return -1; // empty range: target is not present
}
How Fast is a Binary Search?
• Worst case: 11 items in the list took 4 tries
• How about the worst case for a list with 32 items?
  ◦ 1st try - list has 16 items
  ◦ 2nd try - list has 8 items
  ◦ 3rd try - list has 4 items
  ◦ 4th try - list has 2 items
  ◦ 5th try - list has 1 item

How Fast is a Binary Search?
Try   List has 250 items   List has 512 items
1st   125 items            256 items
2nd   63 items             128 items
3rd   32 items             64 items
4th   16 items             32 items
5th   8 items              16 items
6th   4 items              8 items
7th   2 items              4 items
8th   1 item               2 items
9th                        1 item
What’s the Pattern?
• List of 11 took 4 tries
• List of 32 took 5 tries
• List of 250 took 8 tries
• List of 512 took 9 tries

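• 32 = 2^5 and 512 = 2^9
• 8 < 11 < 16, that is, 2^3 < 11 < 2^4
• 128 < 250 < 256, that is, 2^7 < 250 < 2^8
• In each case the number of tries is about log2 n. If the final check of the
single remaining item is also counted, the worst case is floor(log2 n) + 1
comparisons, so exact powers of two such as 32 and 512 need one more try
than listed above.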

A Very Fast Algorithm!
• How long (worst case) will it take to find an item in a list 30,000 items
long?

  ◦ 2^10 = 1024    2^13 = 8192
  ◦ 2^11 = 2048    2^14 = 16384
  ◦ 2^12 = 4096    2^15 = 32768

• 2^14 < 30,000 < 2^15, so it will take only 15 tries!

Binary Search
Efficiency of Binary Search
• In the worst-case scenario, searching a sorted array of 1023 elements
will take only 10 comparisons when using a binary search.
• Repeatedly dividing 1023 by 2 (because, after each comparison, we
can eliminate from consideration half of the remaining elements) and
rounding down (because we also remove the middle element) yields
the values 511, 255, 127, 63, 31, 15, 7, 3, 1 and 0.
• The number 1023 (2^10 – 1) is divided by 2 only 10 times to get the
value 0, which indicates that there are no more elements to test.
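The repeated halving above can be reproduced with a short counting loop. This is only a sketch: the class and method names are assumptions, and the count includes the final step from one remaining element down to zero.

```java
// Hypothetical helper class; names are chosen for illustration only.
class HalvingCount {
    // Number of integer divisions by 2 needed to reduce n to 0.
    static int halvingsToZero(long n) {
        int count = 0;
        while (n > 0) {
            n /= 2;      // integer division rounds down
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        long[] sizes = {11, 250, 1023, 30_000, 1_048_575, 1_000_000_000};
        for (long n : sizes) {
            System.out.println(n + " elements -> " + halvingsToZero(n) + " halvings");
        }
        // prints 4, 8, 10, 15, 20 and 30 halvings respectively
    }
}
```

For 1023 the loop passes through 511, 255, 127, 63, 31, 15, 7, 3, 1 and 0, which is the sequence listed above.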

Binary Search (cont.)
• Dividing by 2 is equivalent to one comparison in the binary search
algorithm.
• Thus, an array of 1,048,575 (2^20 – 1) elements takes a maximum of 20
comparisons to find the key, and an array of approximately one billion
elements takes a maximum of 30 comparisons to find the key.
• This is a tremendous performance improvement over the linear
search.
• For a one-billion-element array, this is a difference between an
average of 500 million comparisons for the linear search and a
maximum of only 30 comparisons for the binary search!
• The maximum number of comparisons needed for the binary search of
any sorted array is the exponent of the first power of 2 greater than
the number of elements in the array, which is approximately log2 n.

Binary Search (cont.)
• Logarithms of different bases differ only by a constant factor, so in Big O
notation the base can be omitted.
• This results in a Big O of O(log n) for a binary search, which is also
known as logarithmic runtime and pronounced “on the order of log n”
or more simply “order log n.”
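The reason the base does not matter can be made explicit with the change-of-base identity. The choice of base 10 below is only an example; any other base gives a different constant but the same conclusion.

```latex
\[
\log_2 n \;=\; \frac{\log_{10} n}{\log_{10} 2} \;\approx\; 3.32\,\log_{10} n
\qquad\Longrightarrow\qquad
O(\log_2 n) \;=\; O(\log_{10} n) \;=\; O(\log n)
\]
```

Since Big O ignores constant factors, the factor of about 3.32 disappears.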

Improving Binary Search
• Interpolation search is an improved variant of binary search. It
chooses the position to probe by estimating where the required value
should lie. For this algorithm to work properly, the data collection
should be sorted and uniformly distributed.
• Binary search has a huge time-complexity advantage over linear
search. Linear search has a worst-case complexity of O(n), whereas
binary search has O(log n).
• There are cases where the location of the target data may be known in
advance. For example, in a telephone directory, suppose we want to
find the telephone number of Mehran. Here, linear search and even
binary search seem slow, because we can jump directly to the memory
space where the names starting with 'M' are stored.

Interpolation Search Algorithm
• Step 1 − Start searching from the "middle" of the list, which here is the position estimated by the probing formula.
• Step 2 − If it is a match, return the index of the item and exit.
• Step 3 − If it is not a match, compute a new probe position.
• Step 4 − Divide the list using the probing formula and find the new middle.
• Step 5 − If the data is greater than the middle, search in the higher sub-list.
• Step 6 − If the data is smaller than the middle, search in the lower sub-list.
• Step 7 − Repeat until a match is found or the sub-list is empty.
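The steps above can be sketched in Java as follows. The lecture does not state the probing formula, so this sketch assumes the commonly used linear-interpolation estimate pos = low + (target - a[low]) * (high - low) / (a[high] - a[low]) and uses it for every probe, including the first. The class and method names are hypothetical.

```java
// Hypothetical sketch of interpolation search on an ascending int array.
class InterpolationSearchSketch {
    // Returns the index of target in sorted, or -1 if it is not present.
    static int interpolationSearch(int[] sorted, int target) {
        int low = 0;
        int high = sorted.length - 1;
        // The target can only lie between sorted[low] and sorted[high].
        while (low <= high && target >= sorted[low] && target <= sorted[high]) {
            if (sorted[low] == sorted[high]) {
                return low; // remaining values are all equal, and equal to target
            }
            // Assumed probing formula: linear interpolation between the end values.
            // long arithmetic avoids overflow in the multiplication.
            long offset = ((long) target - sorted[low]) * (high - low)
                    / ((long) sorted[high] - sorted[low]);
            int pos = low + (int) offset;
            if (sorted[pos] == target) {
                return pos;
            } else if (sorted[pos] < target) {
                low = pos + 1;   // search the higher sub-list
            } else {
                high = pos - 1;  // search the lower sub-list
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] evenlySpaced = {10, 20, 30, 40, 50, 60, 70, 80, 90, 100};
        System.out.println(interpolationSearch(evenlySpaced, 70)); // 6
        System.out.println(interpolationSearch(evenlySpaced, 35)); // -1
    }
}
```

On evenly spaced data such as this array, the first probe for 70 lands directly on index 6. On unevenly distributed data the estimate is less accurate and more probes are needed.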
