Linear and binary searching
1. Introduction
Every day, we sift through massive amounts of information—whether we’re searching for a file in our
computer’s folders, finding an email in our inbox, or looking up a product on an e-commerce website.
Searching is so fundamental that without efficient ways to find what we need, working with large sets
of data would be painfully slow and complicated.
In software and algorithmic design, we often talk about search algorithms: instructions for how a
computer locates a specific piece of data in a collection. Two of the most common are Linear Search
and Binary Search.
Real-World Relevance:
Understanding these methods will give you a solid base for more advanced concepts in data
structures and algorithms—like sorting, graph traversal, and even machine learning techniques. Let’s
begin with the more straightforward Linear Search, then compare it to the more powerful Binary
Search.
2. Linear Search
Imagine a shoebox full of index cards. Each card has a word, and you want to find a specific word.
You’d pick the first card, check it, then move on to the second card, and so on. This is
effectively the process of Linear Search.
In each case, you only finish searching once you’ve encountered the desired item or confirmed it isn’t
there.
2.3 Detailed Code Example (Python)
Step-by-Step Explanation:
1. Function Definition: linear_search(arr, x) indicates that we’ll pass an array (or list) arr and
the element x we’re searching for.
2. Loop: We iterate i from 0 up to (length of arr - 1).
3. Check: At each position, compare arr[i] with x.
4. Return: If they match, immediately return the index i.
5. Not Found: If the loop finishes without finding a match, return -1.
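The five steps above can be sketched as follows. This is a minimal version that assumes arr is a Python list and that -1 signals "not found", as the steps describe; the sample list is only an illustration.

```python
def linear_search(arr, x):
    """Return the index of the first occurrence of x in arr, or -1 if absent."""
    for i in range(len(arr)):   # i runs from 0 to len(arr) - 1
        if arr[i] == x:         # compare the current element with the target
            return i            # match: stop immediately
    return -1                   # loop finished without a match


numbers = [7, 2, 13, 1, 9, 5]      # illustrative, unsorted data
print(linear_search(numbers, 13))  # 2
print(linear_search(numbers, 4))   # -1, since 4 is not in the list
```

Because the loop never relies on ordering, the same function works on sorted and unsorted lists alike.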
Binary Search quickly shrinks the search area, halving it at each step, until the target is found or
until there’s nowhere left to search.
Important Note: Binary Search requires sorted data. If the data is not sorted, the algorithm’s logic will
fail, as looking “left” or “right” no longer reliably moves closer to or further from the target.
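To see why sorting matters, here is a small sketch that uses bisect_left from Python's standard bisect module as a stand-in for a hand-written Binary Search. The helper name found_by_bisection and the sample list are assumptions made for illustration.

```python
from bisect import bisect_left


def found_by_bisection(arr, x):
    """Binary-search arr for x; only meaningful if arr is sorted ascending."""
    i = bisect_left(arr, x)            # leftmost position where x could be inserted
    return i < len(arr) and arr[i] == x


unsorted = [7, 2, 13, 1, 9, 5]         # illustrative data
print(2 in unsorted)                             # True: 2 really is in the list
print(found_by_bisection(unsorted, 2))           # False: the halving looked in the wrong place
print(found_by_bisection(sorted(unsorted), 2))   # True once the data is sorted
```

On the unsorted list the search raises no error; it simply reports a wrong answer, which is why the sorted-data requirement is easy to overlook.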
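def binary_search(arr, x):
    left, right = 0, len(arr) - 1
    while left <= right:
        mid = (left + right) // 2
        if arr[mid] == x:
            return mid  # Found the target
        elif arr[mid] > x:
            right = mid - 1  # Eliminate the right half
        else:
            left = mid + 1  # Eliminate the left half
    return -1  # Range is empty: the target is not present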
Step-by-Step Explanation:
1. Initialization: left points to the start (index 0), and right points to the end (index len(arr) - 1).
2. Loop Condition: While there’s a valid range (left <= right), keep narrowing.
3. Mid Calculation: We use (left + right) // 2 to get the middle index.
4. Comparison:
   - If arr[mid] is the target x, we’re done.
   - If arr[mid] is greater than x, we move right to mid - 1.
   - Otherwise, we move left to mid + 1.
5. No Match: If the loop exits, the element is not present.
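To watch the range narrow, the steps above can be instrumented. The sketch below is a hypothetical variant (binary_search_trace is not part of the notes) that records left, right, and mid on every pass through the loop.

```python
def binary_search_trace(arr, x):
    """Binary search that also records (left, right, mid) for every step."""
    left, right = 0, len(arr) - 1
    steps = []
    while left <= right:
        mid = (left + right) // 2
        steps.append((left, right, mid))
        if arr[mid] == x:
            return mid, steps
        elif arr[mid] > x:
            right = mid - 1      # target can only be left of mid
        else:
            left = mid + 1       # target can only be right of mid
    return -1, steps             # range became empty


index, steps = binary_search_trace([1, 2, 5, 7, 9, 13], 9)
print(index)   # 4
print(steps)   # [(0, 5, 2), (3, 5, 4)]
```

Two steps are enough here: the first comparison discards indices 0 through 2, and the second lands on the target.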
3.4 Recursive Version (Python)
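def binary_search_recursive(arr, left, right, x):
    if left > right:
        return -1  # Base case: empty range, so the target is not present
    mid = (left + right) // 2
    if arr[mid] == x:
        return mid
    elif arr[mid] > x:
        return binary_search_recursive(arr, left, mid - 1, x)
    else:
        return binary_search_recursive(arr, mid + 1, right, x)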
Big-O Notation Reminder: O(log n) grows very slowly compared to O(n). For example, if
n = 1,000,000, log2(n) is just about 20.
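The gap between the two growth rates can be checked with a short calculation. This sketch counts worst-case steps only: a Linear Search may inspect all n items, while each Binary Search comparison leaves at most half of the range. The function names are illustrative.

```python
def worst_case_linear_steps(n):
    """Worst case for Linear Search: every one of the n items is checked."""
    return n


def worst_case_binary_steps(n):
    """Worst case for Binary Search: halve the range until nothing is left."""
    steps = 0
    while n > 0:
        n //= 2      # at most half of the range survives each comparison
        steps += 1
    return steps


for n in (10, 1_000, 1_000_000):
    print(n, worst_case_linear_steps(n), worst_case_binary_steps(n))
# 10 10 4
# 1000 1000 10
# 1000000 1000000 20
```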
Algorithm | Concept | Time Complexity | Space Complexity | Requires Sorting? | Real-World Use Case
--- | --- | --- | --- | --- | ---
Linear Search | Check each item one by one until found/not found | O(n) | O(1) | No | Small lists; rarely searched data; data easily accessed sequentially
Binary Search | Repeatedly divide the (sorted) list in half for each step | O(log n) | O(1) iter., O(log n) rec. | Yes | Large datasets; frequent searches; scenarios where data is already sorted
Additional Tips
1. Sorted Data: If you have to sort the data first, weigh the one-time cost of sorting (O(n log n))
against the O(log n) lookups you gain if you’ll search multiple times.
2. Memory Constraints: Both algorithms typically use O(1) extra memory in their iterative forms, so
memory rarely decides between them.
3. Data Location: If your data structure is not conducive to random access (e.g., linked lists),
Binary Search loses its advantage because you can’t jump to the midpoint in constant time.
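The first tip can be turned into a rough back-of-the-envelope model. The sketch below compares approximate comparison counts only; it ignores constant factors and real-world effects such as caching, and the list size and search counts are illustrative assumptions.

```python
import math


def linear_total(n, k):
    """Rough worst-case comparisons for k linear searches over n items."""
    return k * n


def sort_then_binary_total(n, k):
    """Rough comparisons to sort once (n log2 n), then run k binary searches."""
    return n * math.log2(n) + k * math.log2(n)


n = 10_000_000                     # illustrative list size
for k in (1, 10, 100, 1_000):      # illustrative numbers of searches
    if sort_then_binary_total(n, k) < linear_total(n, k):
        print(k, "searches: sorting first does fewer comparisons")
    else:
        print(k, "searches: plain linear search does fewer comparisons")
```

Under this model the break-even point sits near log2(n) searches, so sorting a large list for only a handful of lookups is unlikely to pay off.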
1. Q: "I have an unsorted list of size 10 million, and I plan to do hundreds of searches. Should I sort
the list first or just do linear searches for each query?"
A: Sorting once costs O(n log n), which is substantial for 10 million items. After that, each
Binary Search costs O(log n), so k searches cost O(n log n + k log n) in total, versus
O(k · n) for k linear searches. With hundreds of searches, sorting first will usually do far
fewer comparisons; with only a handful, the sort may not pay for itself. Weigh the cost of
sorting against the sum of the repeated searches.
2. Q: "How does Binary Search handle data that changes frequently (e.g., new elements constantly
being added or updated)?"
A: If the data is constantly changing (thus losing its sorted order), you’d have to re-sort
(costly!) or adopt a data structure (like a balanced binary search tree) that keeps elements
in sorted order as you insert or remove items. The standard Binary Search only works on
sorted, static (or rarely updated) lists.
3. Q: "Which algorithm is better for searching in a linked list?"
A: For a singly linked list, you don’t have direct access to the middle element in O(1) time. So
even if it’s sorted, you can’t easily jump to the midpoint. Typically, Linear Search is used
unless you have a specialized data structure (like a skip list).
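For the second question, one lightweight option in Python is to keep a list sorted as items arrive, using insort from the standard bisect module. This is a swapped-in technique rather than the balanced binary search tree mentioned in the answer: each insort call still shifts list elements, so an insertion costs O(n). The sample data is illustrative.

```python
from bisect import bisect_left, insort

scores = [1, 5, 9, 13]      # illustrative data, already sorted

insort(scores, 7)           # inserts 7 at the position that keeps the list sorted
insort(scores, 2)
print(scores)               # [1, 2, 5, 7, 9, 13]

i = bisect_left(scores, 7)  # binary search still works after the updates
print(i < len(scores) and scores[i] == 7)   # True
```

This suits lists that change occasionally; for heavy insert and delete traffic, a tree-based structure avoids the O(n) shifting.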
1. Linear Search
Definition: Check each element in turn.
Time: O(n).
Space: O(1).
Use When: Unsorted data, small lists, minimal searching.
2. Binary Search
Definition: Divide and conquer; check the middle of a sorted list.
Time: O(log n).
Space: O(1) iterative, O(log n) recursive.
Use When: Large sorted data, repeated searches.
8. Practical Exercise
Goal: Observe how sorting once and searching many times might be beneficial compared to just
searching linearly.
If the user had answered “no,” the program would do a Linear Search directly on
[7, 2, 13, 1, 9, 5], checking each item in sequence.
8.3 Starter Code Snippet (Python)
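def search_comparison_program():
    # Step 1: Gather data
    arr_str = input("Enter integers separated by spaces: ")
    arr = list(map(int, arr_str.split()))

    # Step 2: Read the target and the sort choice (prompt wording is assumed)
    x = int(input("Enter the integer to search for: "))
    choice = input("Sort the list first? (yes/no): ").strip().lower()

    if choice == 'yes':
        # Sort the array
        arr.sort()
        print("The list after sorting:", arr)
        print("Performing Binary Search...")
        result = binary_search(arr, x)  # Use the iterative version
    else:
        print("Performing Linear Search on the unsorted list...")
        result = linear_search(arr, x)

    # Step 3: Report the outcome (assumed; -1 means "not found")
    if result != -1:
        print("Found", x, "at index", result)
    else:
        print(x, "was not found.")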
Explanation:
End of Notes
You now have a comprehensive foundation in two major searching algorithms. Feel free to
experiment with your own data, track the number of comparisons or steps it takes, and see how these
algorithms perform in practice. Happy coding!