
Interpolation&External

External sorting is required when the data being sorted is too large to fit in main memory. It uses a hybrid sort-merge strategy where chunks small enough to fit in memory are read, sorted, and written to a temporary file. These sorted sub-files are then merged into a single larger sorted file. For example, the external merge sort algorithm sorts chunks in RAM then merges the sorted chunks together into successively bigger runs until the file is fully sorted.

Uploaded by

Aastha Chauhan
Copyright
© All Rights Reserved
Interpolation Search




Given a sorted array arr[] of n uniformly distributed values, write a function to search
for a particular element x in the array.
Linear Search finds the element in O(n) time, Jump Search takes O(√n) time,
and Binary Search takes O(log n) time.
Interpolation Search is an improvement over Binary Search for cases where
the values in a sorted array are uniformly distributed. Interpolation constructs new
data points within the range of a discrete set of known data points. Binary Search
always probes the middle element. Interpolation search, on the other hand,
may probe different locations depending on the value of the key being searched. For
example, if the key is close to the last element, interpolation search is
likely to start searching near the end of the array.
To find the position to be searched, it uses the following formula.
// The idea of the formula is to return a higher value of pos
// when the element to be searched is closer to arr[hi], and a
// smaller value when it is closer to arr[lo].
pos = lo + ((x - arr[lo]) * (hi - lo)) / (arr[hi] - arr[lo])

arr[] ==> Array where elements need to be searched
x ==> Element to be searched
lo ==> Starting index in arr[]
hi ==> Ending index in arr[]
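
As a quick illustration of the probe formula, the sketch below evaluates pos for a small, evenly spaced array. The helper name probe_position is hypothetical and not part of the original text. It assumes integer (floor) division and arr[hi] != arr[lo].

```python
def probe_position(arr, lo, hi, x):
    """Estimate the index of x in sorted arr[lo..hi] by linear interpolation.

    Assumes arr[hi] != arr[lo]; floor division keeps the result an integer index.
    """
    return lo + ((x - arr[lo]) * (hi - lo)) // (arr[hi] - arr[lo])


arr = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

# A key close to arr[hi] gives a probe near the end of the array ...
print(probe_position(arr, 0, len(arr) - 1, 90))  # 8
# ... and a key close to arr[lo] gives a probe near the start.
print(probe_position(arr, 0, len(arr) - 1, 20))  # 1
```

Because this array is perfectly evenly spaced, the first probe lands exactly on the key. On real data the probe is only an estimate.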

There are many different interpolation methods; one of them is linear
interpolation. Linear interpolation takes two data points, (x1, y1)
and (x2, y2), and estimates the value y at a point x between them as:
y = y1 + (x - x1) * (y2 - y1) / (x2 - x1).
This algorithm works the way we search for a word in a dictionary. The interpolation
search algorithm improves on the binary search algorithm. The formula for finding the
relative position is: K = (data - low) / (high - low), where data is the key being
searched and low and high are the values at the two ends of the current range.

K is the fraction of the range at which to probe, and it is used to narrow the search space.
Binary search always probes the middle index, (low + high) / 2, which corresponds to a fixed K = 1/2.
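
To make the link between linear interpolation and the probe position concrete, here is a small sketch. The function name linear_interpolate and the sample numbers are illustrative assumptions. The two known points are taken to be (value, index) pairs at the ends of the range, so interpolating at the key yields an estimated index.

```python
def linear_interpolate(x1, y1, x2, y2, x):
    """Estimate y at x on the straight line through (x1, y1) and (x2, y2)."""
    return y1 + (x - x1) * (y2 - y1) / (x2 - x1)


# Hypothetical range: indices 0..100 holding the values 0..1000.
lo, hi = 0, 100
low_value, high_value = 0, 1000
key = 250

# Fraction of the range at which interpolation search probes.
k = (key - low_value) / (high_value - low_value)
print(k)  # 0.25

# Interpolating on (value, index) pairs gives the estimated index of the key.
print(linear_interpolate(low_value, lo, high_value, hi, key))  # 25.0

# Binary search ignores the key and always uses the fraction 1/2.
print(lo + (hi - lo) // 2)  # 50
```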

The formula for pos can be derived as follows.

Let's assume that the elements of the array are linearly
distributed.
General equation of a line: y = m*x + c.
Here y is the value in the array and x is its index.

Now put lo, hi and pos in as the index. In the equations below,
x is the element being searched, that is, the value at index pos.

arr[hi] = m*hi + c ----(1)
arr[lo] = m*lo + c ----(2)
x = m*pos + c ----(3)

Subtracting equation (2) from (1):

m = (arr[hi] - arr[lo]) / (hi - lo)

Subtracting equation (2) from (3):

x - arr[lo] = m * (pos - lo)
lo + (x - arr[lo]) / m = pos
pos = lo + (x - arr[lo]) * (hi - lo) / (arr[hi] - arr[lo])
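
The derivation can be sanity-checked numerically. The sketch below builds an array that is exactly linear, arr[i] = m*i + c, and confirms that the formula recovers every index on the first probe. The slope, the intercept, and the name pos_from_formula are arbitrary choices made for illustration.

```python
# Arbitrary slope and intercept for a perfectly linear array.
m, c = 3, 7
arr = [m * i + c for i in range(50)]
lo, hi = 0, len(arr) - 1


def pos_from_formula(arr, lo, hi, x):
    """Probe position from the derived formula, using integer division."""
    return lo + ((x - arr[lo]) * (hi - lo)) // (arr[hi] - arr[lo])


# For linear data the formula should return the exact index of each element.
all_exact = all(pos_from_formula(arr, lo, hi, arr[i]) == i for i in range(len(arr)))
print(all_exact)  # True
```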
Algorithm
The rest of the interpolation search algorithm is the same as binary search, except for the
above partition logic.
• Step 1: In a loop, calculate the value of "pos" using the probe position
formula.
• Step 2: If it is a match, return the index of the item, and exit.
• Step 3: If the item is less than arr[pos], calculate the probe position of the
left sub-array. Otherwise, calculate the same in the right sub-array.
• Step 4: Repeat until a match is found or the sub-array reduces to zero.
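
The four steps can also be written as a loop. The following is a minimal iterative sketch and not necessarily the form the original author intended. The function name interpolation_search_iterative is an assumption, and the check for arr[hi] == arr[lo] is an added guard against division by zero.

```python
def interpolation_search_iterative(arr, x):
    """Return the index of x in sorted arr, or -1 if x is not present."""
    lo, hi = 0, len(arr) - 1

    # Step 4: keep going while x can still lie inside arr[lo..hi].
    while lo <= hi and arr[lo] <= x <= arr[hi]:
        # Added guard: all remaining values are equal, and the loop
        # condition guarantees they equal x.
        if arr[hi] == arr[lo]:
            return lo

        # Step 1: probe position (multiply before dividing).
        pos = lo + ((x - arr[lo]) * (hi - lo)) // (arr[hi] - arr[lo])

        # Step 2: match found.
        if arr[pos] == x:
            return pos

        # Step 3: continue in the right or the left sub-array.
        if arr[pos] < x:
            lo = pos + 1
        else:
            hi = pos - 1

    return -1


arr = [10, 12, 13, 16, 18, 19, 20, 21, 22, 23, 24, 33, 35, 42, 47]
print(interpolation_search_iterative(arr, 18))  # 4
print(interpolation_search_iterative(arr, 11))  # -1
```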

# Python3 program to implement
# interpolation search
# with recursion

# If x is present in arr[lo..hi], then
# returns index of it, else returns -1.
def interpolationSearch(arr, lo, hi, x):

    # Since array is sorted, an element present
    # in array must be in range defined by corner
    if lo <= hi and arr[lo] <= x <= arr[hi]:

        # All values in the range are equal (and equal to x);
        # return early to avoid dividing by zero below
        if arr[hi] == arr[lo]:
            return lo

        # Probing the position with keeping
        # uniform distribution in mind. Multiply before
        # dividing so integer division does not truncate to 0.
        pos = lo + ((x - arr[lo]) * (hi - lo)) // (arr[hi] - arr[lo])

        # Condition of target found
        if arr[pos] == x:
            return pos

        # If x is larger, x is in right subarray
        if arr[pos] < x:
            return interpolationSearch(arr, pos + 1, hi, x)

        # If x is smaller, x is in left subarray
        return interpolationSearch(arr, lo, pos - 1, x)

    return -1


# Driver code

# Array of items in which
# search will be conducted
arr = [10, 12, 13, 16, 18, 19, 20,
       21, 22, 23, 24, 33, 35, 42, 47]
n = len(arr)

# Element to be searched
x = 18
index = interpolationSearch(arr, 0, n - 1, x)

if index != -1:
    print("Element found at index", index)
else:
    print("Element not found")

# This code is contributed by Hardik Jain

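Output
Element found at index 4
Time Complexity: O(log2(log2 n)) for the average case on uniformly distributed data, and O(n) for the worst case
Auxiliary Space Complexity: O(1) for an iterative implementation; the recursive version above also uses call-stack space proportional to the number of probes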
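
To see the gap between the average and worst cases, the sketch below counts probes on two inputs: an evenly spaced array, and an array whose last value is a large outlier. Both data sets and the helper name count_probes are assumptions chosen for illustration.

```python
def count_probes(arr, x):
    """Count how many positions interpolation search inspects while looking for x."""
    lo, hi = 0, len(arr) - 1
    probes = 0
    while lo <= hi and arr[lo] <= x <= arr[hi]:
        if arr[hi] == arr[lo]:
            return probes + 1
        pos = lo + ((x - arr[lo]) * (hi - lo)) // (arr[hi] - arr[lo])
        probes += 1
        if arr[pos] == x:
            return probes
        if arr[pos] < x:
            lo = pos + 1
        else:
            hi = pos - 1
    return probes


# Evenly spaced values: the first estimate lands on the key.
uniform = list(range(0, 10_000, 10))
print(count_probes(uniform, 7770))  # 1

# One huge outlier at the end drags every estimate toward lo, so the
# search advances roughly one index per probe (the O(n) worst case).
skewed = list(range(1000)) + [10**9]
print(count_probes(skewed, 999))
```

The second count grows in proportion to the array length, which is why interpolation search is recommended only when the values are close to uniformly distributed.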
External Sorting


External sorting is a term for a class of sorting algorithms that can handle massive
amounts of data. External sorting is required when the data being sorted does not fit
into the main memory of a computing device (usually RAM) and instead, must reside
in the slower external memory (usually a hard drive).
External sorting typically uses a hybrid sort-merge strategy. In the sorting phase,
chunks of data small enough to fit in the main memory are read, sorted, and written
out to a temporary file. In the merge phase, the sorted sub-files are combined into a
single larger file.
Example:
One example is the external merge sort algorithm, which sorts chunks that each fit in RAM and
then merges the sorted chunks together. We first divide the file into runs such that each
run is small enough to fit into the main memory. We then sort each run in the main
memory using the merge sort algorithm. Finally, we merge the resulting runs
together into successively bigger runs until the file is sorted.
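
The run-and-merge idea can be sketched entirely in memory, with a small list standing in for the file. The function names are hypothetical. Python's built-in sorted() is used in place of a hand-written merge sort, and heapq.merge performs each two-way merge.

```python
import heapq


def make_sorted_runs(data, run_size):
    """Split data into chunks of at most run_size items and sort each chunk."""
    return [sorted(data[i:i + run_size]) for i in range(0, len(data), run_size)]


def merge_runs_pairwise(runs):
    """Merge neighbouring runs two at a time until a single run remains."""
    while len(runs) > 1:
        merged = []
        for i in range(0, len(runs), 2):
            # runs[i:i + 2] holds one or two runs; heapq.merge handles both.
            merged.append(list(heapq.merge(*runs[i:i + 2])))
        runs = merged  # each pass produces fewer, bigger runs
    return runs[0] if runs else []


data = [9, 4, 7, 1, 8, 2, 6, 3, 5]
runs = make_sorted_runs(data, run_size=3)
print(runs)                       # [[4, 7, 9], [1, 2, 8], [3, 5, 6]]
print(merge_runs_pairwise(runs))  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```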
When do we use external sorting?
• When the unsorted data is too large to be sorted in the computer's internal
memory, we use external sorting.
• External sorting uses a secondary storage device, such as tape or a disk array.
• When the data is large, with algorithms such as merge sort and quick sort:
  ◦ Quick Sort: best average runtime.
  ◦ Merge Sort: best worst-case time.
• To perform a sort-merge join operation on data.
• To process an ORDER BY query.
• To select duplicate elements.
• When we need to take large input from the user.

Examples:
1. Merge sort
2. Tape sort
3. Polyphase sort
4. External radix
5. External merge
Prerequisites: MergeSort, Merge K Sorted Arrays
Input:
input_file: Name of the input file, e.g. input.txt
output_file: Name of the output file, e.g. output.txt
run_size: Size of a run (can fit in RAM)
num_ways: Number of runs to be merged
To solve the problem, follow the idea below:
The idea is straightforward. All the elements cannot be sorted at once because the data is
too large to handle in one piece. So the data is divided into chunks, and each chunk is
sorted using merge sort. The sorted chunks are then written to files. After the individual
chunks are sorted, the whole data set is sorted by using the idea of merging k sorted arrays.
Follow the steps below to solve the problem:
• Read input_file such that at most 'run_size' elements are read at a time. Do the
following for every run read into an array:
  ◦ Sort the run using MergeSort.
  ◦ Store the sorted array in a file, say file 'i' for the ith run.
• Merge the sorted files using the approach discussed in Merge K Sorted Arrays.