Data Structure Unit-I Chapter-IV

The document discusses various searching algorithms including linear search, binary search, and searching a sorted sequence. Linear search iterates through a sequence sequentially to find a target value. Binary search divides the search space in half on each step to more quickly find the target value in a sorted sequence. Searching a sorted sequence allows early termination of linear search if a larger value than the target is found.

Uploaded by

Tanvi Sawant

FYCS SEM-II PAPER-IV [DATA STRUCTURE] Prof. Mala Mishra

CHAPTER -4 SEARCHING AND SORTING

SEARCHING
• Searching is the process of selecting particular information from a collection of data based
on specific criteria. You are familiar with this concept from your experience in performing
web searches to locate pages containing certain words or phrases or when looking up a
phone number in the telephone book. In this text, we restrict the term searching to refer to
the process of finding a specific item in a collection of data items.
• The search operation can be performed on many different data structures. The sequence
search, which is the focus in this chapter, involves finding an item within a sequence using
a search key to identify the specific item. A key is a unique value used to identify the data
elements of a collection. In collections containing simple types such as integers or reals,
the values themselves are the keys.
• For collections of complex types, a specific data component has to be identified as the key.
In some instances, a key may consist of multiple components, which is also known as a
compound key.

THE LINEAR SEARCH
• We want to write an algorithm that gets the computer to search through a list of numbers
(such as a list of airline flight numbers).
• If the list is ordered from smallest to biggest, the easiest way to find a number is to start
at the beginning and compare our number (which we will call the target) to each number
in the list.
• If we reach our target, then we are done. This method of searching is called linear
searching.
Here is the algorithm:

i. Start with the first item in a list.

ii. Compare the current item to the target.

iii. If the current value matches the target then we declare victory and stop.

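iv. If the current value is less than the target, then set the current item to be the next
item and repeat from step ii.

v. If the current value is greater than the target, or there is no next item, then the
target is not in the list and we stop.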

• The simplest solution to the sequence search problem is the sequential or linear search
algorithm. This technique iterates over the sequence, one item at a time, until the specific
item is found or all items have been examined. In Python, a target item can be found in
a sequence using the in operator:

    if key in theArray:
        print("The key is in the array.")
    else:
        print("The key is not in the array.")

• The use of the in operator makes our code simple and easy to read but it hides the inner
workings. Underneath, the in operator is implemented as a linear search. Consider the
unsorted 1-D array of integer values shown in Figure 5.1(a).
• To determine if value 31 is in the array, the search begins with the value in the first element.
Since the first element does not contain the target value, the next element in sequential
order is compared to value 31. This process is repeated until the item is found in the sixth
position. What if the item is not in the array? For example, suppose we want to search for
value 8 in the sample array. The search begins at the first entry as before, but this time
every item in the array is compared to the target value. It cannot be determined that the
value is not in the sequence until the entire array has been traversed, as illustrated in Figure
5.1(b).

Finding a Specific Item:


• The function in Listing 5.1 implements the sequential search algorithm, which results in a
Boolean value indicating success or failure of the search.
• This is the same operation performed by the Python in operator.
• A count-controlled loop is used to traverse through the sequence during which each element
is compared against the target value. If the item is in the sequence, the loop is terminated and
True is returned. Otherwise, a full traversal is performed and False is returned after the loop
terminates.
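
Since the listing itself is not reproduced here, the following is a minimal sketch of the sequential search described above. The function name `linear_search` and the sample values are assumptions for illustration; the values were chosen so that 31 sits in the sixth position and 8 is absent, as in the text.

```python
def linear_search(values, target):
    """Return True if target occurs in values, otherwise False."""
    n = len(values)
    # Count-controlled loop: visit every index from 0 to n - 1.
    for i in range(n):
        if values[i] == target:
            return True      # found: stop early
    return False             # full traversal without a match


# Hypothetical unsorted sample; 31 is in the sixth position.
sample = [10, 51, 2, 18, 4, 31, 13, 5, 23, 64, 29]
print(linear_search(sample, 31))
print(linear_search(sample, 8))
```

The result matches what `31 in sample` and `8 in sample` would report.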

SEARCHING A SORTED SEQUENCE


• A linear search can also be performed on a sorted sequence, which is a sequence containing
values in a specific order.
• For example, the values in the array illustrated in Figure 5.2 are in ascending or increasing
numerical order. That is, each value in the array is larger than its predecessor.

• A linear search on a sorted sequence works in the same fashion as that for the unsorted
sequence, with one exception. It's possible to terminate the search early when the value is
not in the sequence instead of always having to perform a complete traversal.
• For example, suppose we want to search for 8 in the array from Figure 5.2. When the fourth
item, which is value 10, is examined, we know value 8 cannot be in the sorted sequence or
it would come before 10. The implementation of a linear search on a sorted sequence is
shown in Listing 5.2 on the next page.
• The only modification to the earlier version is the inclusion of a test to determine if the
current item within the sequence is larger than the target value.
• If a larger value is encountered, the loop terminates and False is returned. With the
modification to the linear search algorithm, we have produced a better version, but the
time-complexity remains the same. The reason is that the worst case occurs when the value
is not in the sequence and is larger than the last element. In this case, we must still traverse
the entire sequence of n items.
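
A minimal sketch of the early-terminating version follows. The name `sorted_linear_search` and the sample array are assumptions; the values are chosen to agree with the description above (the fourth item is 10) and the sequence is assumed to be in ascending order.

```python
def sorted_linear_search(values, target):
    """Linear search on an ascending sequence with early termination."""
    for i in range(len(values)):
        if values[i] == target:
            return True
        elif values[i] > target:
            # Every later item is even larger, so the target cannot appear.
            return False
    return False


# Hypothetical sorted sample; the fourth item is 10.
sorted_sample = [2, 4, 5, 10, 13, 18, 23, 29, 31, 51, 64]
print(sorted_linear_search(sorted_sample, 8))    # stops at 10
print(sorted_linear_search(sorted_sample, 70))   # worst case: full traversal
```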

FINDING THE SMALLEST VALUE


• Instead of searching for a specific value in an unsorted sequence, suppose we
wanted to search for the smallest value, which is equivalent to applying Python's
min() function to the sequence.
• A linear search is performed as before, but this time we must keep track of the
smallest value found for each iteration through the loop, as illustrated in Listing
5.3.
• To prime the loop, we assume the first value in the sequence is the smallest and start the
comparisons at the second item. Since the smallest value can occur anywhere in the
sequence, we must always perform a complete traversal, resulting in a worst case time of
O(n).
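
The idea can be sketched as follows. The name `find_smallest` is an assumption, and the check for an empty sequence is an addition not discussed in the text.

```python
def find_smallest(values):
    """Return the smallest item of a non-empty sequence."""
    if len(values) == 0:
        # Assumption: an empty sequence has no smallest value.
        raise ValueError("sequence is empty")
    smallest = values[0]              # prime the loop with the first item
    for i in range(1, len(values)):   # start comparing at the second item
        if values[i] < smallest:
            smallest = values[i]
    return smallest


print(find_smallest([10, 51, 2, 18, 4, 31]))
```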

THE BINARY SEARCH


• The linear search algorithm for a sorted sequence produced a slight
improvement over the linear search with an unsorted sequence, but both have
a linear time-complexity in the worst case. To improve the search time for a
sorted sequence, we can modify the search technique itself.

• For example, think about how you look for a name in the phone book. You open it
up somewhere in the middle and see whether you need to go forward or backward.

• We need to give the computer more precise instructions than this, so
we tell it to divide the list exactly in half and compare the middle item
to our target.
• If the middle item is smaller than our target, we look in the top half of
the list; if it is bigger than our target, we tell the computer to look in the
bottom half of the list.
• We call this kind of search binary searching because we always divide the
number of items to search in half.
Here is the algorithm:

i. Set the list to be the whole list.

ii. Find the middle value of the list.

iii. If the middle value is equal to the target then we declare victory and stop.

iv. If the middle item is less than the target, then we set the new list to be the upper half
of the old list and we repeat from step 2 using the new list.

v. If the middle item is greater than the target, then we set the new list to be the bottom
half of the old list and we repeat from step 2 using the new list.

• The binary search algorithm works in a similar fashion to the process described above and
can be applied to a sorted sequence. The algorithm starts by examining the middle item of
the sorted sequence, resulting in one of three possible conditions: the middle item is the
target value, the target value is less than the middle item, or the target is larger than the
middle item. Since the sequence is ordered, we can eliminate half the values in the list
when the target value is not found at the middle position.

• Consider the task of searching for value 10 in the sorted array from Figure 5.2.
We first determine which element contains the middle entry. As illustrated in
Figure 5.3, the middle entry contains 18, which is greater than our target of 10.

• Thus, we can discard the upper half of the array from consideration since 10 cannot
possibly be in that part. Having eliminated the upper half, we repeat the process on the
lower half of the array. We then find the middle item of the lower half and compare its
value to the target. Since that entry, which contains 5, is less than the target, we can
eliminate the lower fourth of the array. The process is repeated on the remaining items.
Upon finding value 10 in the middle entry from among those remaining, the process
terminates successfully. If we had not found the target, the process would continue until
either the target value was found or we had eliminated all values from consideration.

IMPLEMENTATION
• The Python implementation of the binary search algorithm is provided in Listing 5.4. The
variables low and high are used to mark the range of elements in the sequence currently
under consideration.

• When the search begins, this range is the entire sequence since the target item can be
anywhere within the sequence. The first step in each iteration is to determine the midpoint
of the sequence.

• If the sequence contains an even number of elements, the midpoint will be chosen such
that the left sequence contains one less item than the right. Figure 5.4 illustrates the
positioning of the low, high, and mid markers as the algorithm progresses.
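
Because the listing is not reproduced here, the following is a minimal sketch of a binary search that uses the low, high, and mid markers described above. The function name and the sample array are assumptions; the sample values are chosen to agree with the walk-through for target 10 (middle entry 18, then 5, then 10).

```python
def binary_search(values, target):
    """Return True if target occurs in the ascending sequence values."""
    low = 0
    high = len(values) - 1
    while low <= high:
        # Integer division: for an even-sized range the left part
        # gets one less item than the right part.
        mid = (low + high) // 2
        if values[mid] == target:
            return True
        elif target < values[mid]:
            high = mid - 1    # discard the upper half
        else:
            low = mid + 1     # discard the lower half
    return False              # range is empty: target not present


# Hypothetical sorted sample consistent with the walk-through.
sorted_sample = [2, 4, 5, 10, 13, 18, 23, 29, 31, 51, 64]
print(binary_search(sorted_sample, 10))
print(binary_search(sorted_sample, 8))
```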

RUN TIME ANALYSIS

• To evaluate the efficiency of the binary search algorithm, assume the sorted sequence contains
n items. We need to determine the maximum number of times the while loop is executed.
• The worst case occurs when the target value is not in the sequence, the same as for the linear
search. The difference with the binary search is that not every item in the sequence has to be
examined before determining the target is not in the sequence, even in the worst case. Since
the sequence is sorted, each iteration of the loop can eliminate from consideration half of the
remaining values.
• As we saw earlier in Section 4.1.2, when the input size is repeatedly reduced by half during
each iteration of a loop, there will be log n iterations in the worst case. Thus, the binary search
algorithm has a worst case time-complexity of O(log n), which is more efficient than the linear
search.
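
To make the logarithmic bound concrete, the sketch below counts loop iterations for an unsuccessful search in which the target is larger than every item, so the search always moves to the right part (which is never smaller than the left part when mid is computed with integer division). The function name `binary_search_steps` is an assumption for illustration.

```python
def binary_search_steps(n):
    """Count while-loop iterations for a failed search over n items
    when the target is larger than every item in the sequence."""
    low, high, steps = 0, n - 1, 0
    while low <= high:
        steps += 1
        mid = (low + high) // 2
        low = mid + 1         # target is larger: keep the right part
    return steps


for n in (11, 1000, 1000000):
    # n.bit_length() equals floor(log2(n)) + 1 for n >= 1.
    print(n, binary_search_steps(n), n.bit_length())
```

For a sequence of one million items the loop runs only 20 times, whereas a linear search may need one million comparisons.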

Sorting
• Sorting is the process of arranging or ordering a collection of items such that each item
and its successor satisfy a prescribed relationship. The items can be simple values,
such as integers and reals, or more complex types, such as student records or
dictionary entries.
• In either case, the ordering of the items is based on the value of a sort key. The key is
the value itself when sorting simple types or it can be a specific component or a
combination of components when sorting complex types.
• We encounter many examples of sorting in everyday life. Consider the listings of a
phone book, the definitions in a dictionary, or the terms in an index, all of which
are organized in alphabetical order to make finding an entry much easier.
• Sorting is one of the most studied problems in computer science and extensive research
has been done in this area, resulting in many different algorithms.
• While Python provides a sort() method for sorting a list, it cannot be used with an array
or other data structures. In addition, exploring the techniques used by some of the
sorting algorithms for improving the efficiency of the sort problem may provide ideas
that can be used with other types of problems.
• In this section, we present three basic sorting algorithms, all of which can be applied
to data stored in a mutable sequence such as an array or list.

Bubble Sort:
• A simple solution to the sorting problem is the bubble sort algorithm, which
rearranges the values by iterating over the list multiple times, causing larger values to
bubble to the top or end of the list.

• Bubble sort belongs to the class of O(n²) sorting algorithms.

• Bubble sort is stable, and it is adaptive when it stops as soon as a pass makes no swaps.

Algorithm:
o Compare each pair of adjacent elements from the beginning of the array and, if
they are in reversed order, swap them.

o If at least one swap has been done, repeat step 1.

• You can imagine that on every pass the big bubbles float to the surface and stay there.
When a pass moves no bubble, sorting stops. Let us see an example of sorting an array
to make the idea of bubble sort clearer.

• Bubble sort is a simple comparison-based sorting algorithm in which each pair of adjacent
elements is compared and the elements are swapped if they are not in order.

• This algorithm is not suitable for large data sets as its average and worst case
complexity are O(n²).

HOW BUBBLE SORT WORKS?


We take an unsorted array for our example: 14, 33, 27, 35, 10. Bubble sort takes O(n²) time,
so we keep the example short.

Bubble sort starts with very first two elements, comparing them to check which one is greater.

In this case, value 33 is greater than 14, so they are already in sorted positions. Next, we
compare 33 with 27.

We find that 27 is smaller than 33 and these two values must be swapped.

The new array should look like this: 14, 27, 33, 35, 10.

Next we compare 33 and 35. We find that both are already in sorted positions.

Then we move to the next two values, 35 and 10.

We know then that 10 is smaller than 35. Hence they are not sorted.

We swap these values. We find that we have reached the end of the array. After one iteration, the
array should look like this: 14, 27, 33, 10, 35.

To be concise, we now show only how the array looks after each iteration. After the
second iteration, it should look like this: 14, 27, 10, 33, 35.

Notice that after each iteration, at least one value moves to its final position at the end.

And when no swap is required, bubble sort learns that the array is completely sorted.
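
The walk-through can be sketched in Python as follows. This is one possible version, not necessarily the one intended by the original notes; the function name and the `swapped` flag are assumptions, and the sample array is the one implied by the comparisons above.

```python
def bubble_sort(values):
    """Sort a mutable sequence in place in ascending order and return it."""
    n = len(values)
    for i in range(n - 1):
        swapped = False
        # After pass i, the last i items are already in their final places.
        for j in range(n - 1 - i):
            if values[j] > values[j + 1]:
                values[j], values[j + 1] = values[j + 1], values[j]
                swapped = True
        if not swapped:
            break             # no swap in this pass: already sorted
    return values


print(bubble_sort([14, 33, 27, 35, 10]))
```

The early exit on a pass without swaps is what makes this version adaptive; without it, every pass is always performed.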

Efficiency Analysis:
• The efficiency of the basic bubble sort algorithm, which always makes every pass, depends
only on the number of keys in the array and is independent of the specific values and the
initial arrangement of those values.

• To determine the efficiency, we must determine the total number of iterations performed
by the inner loop for a sequence containing n values. The outer loop is executed n-1 times
since the algorithm makes n-1 passes over the sequence.

• The number of iterations for the inner loop is not fixed, but depends on the current iteration
of the outer loop. On the first pass over the sequence, the inner loop executes n - 1 times;
on the second pass, n - 2 times; on the third, n -3 times, and so on until it executes once on
the last pass.

• The total number of iterations for the inner loop will be the sum of the first n - 1 integers,
which equals n(n - 1)/2, resulting in a run time of O(n²). Bubble sort is considered one of the
most inefficient sorting algorithms due to the total number of swaps required.
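
The count can be checked with a short sketch that runs the two nested loops of the basic algorithm without doing any sorting. The function name `count_inner_iterations` is an assumption for illustration.

```python
def count_inner_iterations(n):
    """Count inner-loop iterations of the basic bubble sort on n items."""
    count = 0
    for i in range(n - 1):           # n - 1 passes
        for j in range(n - 1 - i):   # n - 1, n - 2, ..., 1 comparisons
            count += 1
    return count


for n in (5, 10, 100):
    # Compare the measured count with the closed form n(n - 1)/2.
    print(n, count_inner_iterations(n), n * (n - 1) // 2)
```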


Selection sort
• Selection sort is a simple sorting algorithm. It is an in-place comparison-based
algorithm in which the list is divided into two parts, the sorted part at the left end and
the unsorted part at the right end. Initially, the sorted part is empty and the unsorted part is the
entire list.
• The smallest element is selected from the unsorted part and swapped with its leftmost
element, and that element becomes a part of the sorted part. This process continues, moving
the boundary of the unsorted part one element to the right each time.
• This algorithm is not suitable for large data sets as its average and worst case complexities are
O(n²), where n is the number of items.

HOW SELECTION SORT WORKS?


Consider the following depicted array as an example.

For the first position in the sorted list, the whole list is scanned sequentially. The first position
currently holds 14; we search the whole list and find that 10 is the lowest value.

So we swap 14 with 10. After one iteration, 10, which is the minimum value in the
list, appears in the first position of the sorted list.

For the second position, where 33 is residing, we start scanning the rest of the list in a linear
manner.

We find that 14 is the second lowest value in the list and it should appear at the second place. We
swap these values.

After two iterations, the two smallest values are positioned at the beginning in sorted order.

The same process is applied to the rest of the items in the array.
Following is a pictorial depiction of the entire sorting process −

Implementation of selection sort algorithm:


• The selection sort, which makes n-1 passes over the array to reposition n-1 values, is
also O(n²). The difference between the selection and bubble sorts is that the selection
sort reduces the number of swaps required to sort the list to O(n).
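
A minimal sketch of selection sort is given below, since the listing is not reproduced here. The function name is an assumption, and the sample array is hypothetical: only its first two values (14 and 33) and its minimum (10) are taken from the walk-through above.

```python
def selection_sort(values):
    """Sort a mutable sequence in place in ascending order and return it."""
    n = len(values)
    for i in range(n - 1):
        # Find the index of the smallest item in the unsorted part.
        smallest = i
        for j in range(i + 1, n):
            if values[j] < values[smallest]:
                smallest = j
        # At most one swap per pass, so at most n - 1 swaps in total.
        if smallest != i:
            values[i], values[smallest] = values[smallest], values[i]
    return values


# Hypothetical sample starting with 14 and 33, with 10 as the minimum.
print(selection_sort([14, 33, 27, 10, 35, 19, 42, 44]))
```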

Insertion Sort:
• This is an in-place comparison-based sorting algorithm. Here, a sub-list is maintained which
is always sorted. For example, the lower part of an array is maintained to be sorted. An element
which is to be inserted into this sorted sub-list has to find its appropriate place and then
be inserted there. Hence the name, insertion sort.
• The array is searched sequentially and unsorted items are moved and inserted into the sorted
sub-list (in the same array). This algorithm is not suitable for large data sets as its average and
worst case complexity are O(n²), where n is the number of items.
Algorithm
Now we have a bigger picture of how this sorting technique works, so we can derive simple steps
by which we can achieve insertion sort.
Step 1 − If it is the first element, it is already sorted.
Step 2 − Pick the next element.
Step 3 − Compare it with the elements in the sorted sub-list.
Step 4 − Shift all the elements in the sorted sub-list that are greater than the
value to be sorted.
Step 5 − Insert the value.
Step 6 − Repeat until the list is sorted.

HOW INSERTION SORT WORKS?


We take an unsorted array for our example.

Insertion sort compares the first two elements.

It finds that 14 and 33 are already in ascending order. For now, 14 is in the sorted sub-list.

Insertion sort moves ahead and compares 33 with 27.

And finds that 33 is not in the correct position.

It swaps 33 with 27. It also checks with all the elements of sorted sub-list. Here we see that the
sorted sub-list has only one element 14, and 27 is greater than 14. Hence, the sorted sub-list
remains sorted after swapping.

By now we have 14 and 27 in the sorted sub-list. Next, it compares 33 with 10.

These values are not in a sorted order.

So we swap them.

However, swapping makes 27 and 10 unsorted.

Hence, we swap them too.

Again we find 14 and 10 in an unsorted order.

We swap them again. By the end of the third iteration, we have a sorted sub-list of 4 items.

This process goes on until all the unsorted values are placed in the sorted sub-list. Now we shall
see some programming aspects of insertion sort.
Implementation of Insertion sort algorithm:
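
The listing is not reproduced here, so the following is a minimal sketch of insertion sort under stated assumptions. The function name is an assumption, and the sample array is hypothetical beyond its first four values (14, 33, 27, 10), which come from the walk-through. This version shifts larger items one place to the right instead of swapping adjacent pairs; the resulting order is the same.

```python
def insertion_sort(values):
    """Sort a mutable sequence in place in ascending order and return it."""
    for i in range(1, len(values)):
        current = values[i]      # next item to insert into the sorted sub-list
        pos = i
        # Shift items of the sorted sub-list that are greater than current.
        while pos > 0 and values[pos - 1] > current:
            values[pos] = values[pos - 1]
            pos -= 1
        values[pos] = current    # insert the item at its place
    return values


# Hypothetical sample; the first four values follow the walk-through.
print(insertion_sort([14, 33, 27, 10, 35, 19, 42, 44]))
```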
