Optimal File Merge Patterns
Last Updated: 11 Jul, 2025
Given n sorted files, the task is to find the minimum number of computations needed to merge them all into a single file.
When two or more sorted files are merged into a single sorted file, the sequence of merges that requires the minimum total computation is known as the Optimal Merge Pattern.
If more than 2 files need to be merged, the merging can be done in pairs. For example, to merge 4 files A, B, C, D: first merge A with B to get X1, then merge X1 with C to get X2, and finally merge X2 with D to get X3 as the output file.
Merging two files of sizes m and n takes m + n computations. We use a greedy strategy: at every step, merge the two smallest files among all the files currently present.
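The m + n cost can be seen in a plain two-way merge. The sketch below is only an illustration: it models each file as a sorted Python list, and the function name `merge_two` is hypothetical. Every record is written to the output exactly once, so the number of moves equals m + n.

```python
def merge_two(a, b):
    """Merge two sorted lists and count how many records are moved."""
    merged = []
    moves = 0
    i = j = 0
    # Repeatedly move the smaller front record to the output
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            merged.append(a[i])
            i += 1
        else:
            merged.append(b[j])
            j += 1
        moves += 1
    # Move whatever is left in either input
    for record in a[i:] + b[j:]:
        merged.append(record)
        moves += 1
    return merged, moves


merged, moves = merge_two([1, 4, 9], [2, 3])
print(merged, moves)  # [1, 2, 3, 4, 9] 5
```

Here m = 3 and n = 2, and the merge moves 5 records.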
Examples:
Given 3 files with sizes 2, 3 and 4 units, find an optimal way to combine them.
Input: n = 3, size = {2, 3, 4}
Output: 14
Explanation: There are different ways to combine these files:
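Method 1 (optimal): Merge 2 and 3 first (cost 5), then merge the result with 4 (cost 9). Total = 5 + 9 = 14.

Method 2: Merge 3 and 4 first (cost 7), then merge the result with 2 (cost 9). Total = 7 + 9 = 16.

Method 3: Merge 2 and 4 first (cost 6), then merge the result with 3 (cost 9). Total = 6 + 9 = 15.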
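To check that 14 really is the minimum for sizes {2, 3, 4}, the sketch below enumerates every possible pairwise merge order and collects the total costs. The function name `all_merge_costs` is hypothetical. The enumeration is exponential, so it is only meant for very small inputs.

```python
from itertools import combinations


def all_merge_costs(sizes):
    """Return the sorted distinct total costs over every pairwise merge order."""
    if len(sizes) <= 1:
        return [0]  # nothing left to merge
    totals = set()
    for i, j in combinations(range(len(sizes)), 2):
        merged = sizes[i] + sizes[j]
        rest = [s for k, s in enumerate(sizes) if k not in (i, j)]
        # Cost of this merge plus the cost of merging everything that remains
        for sub_total in all_merge_costs(rest + [merged]):
            totals.add(merged + sub_total)
    return sorted(totals)


print(all_merge_costs([2, 3, 4]))  # [14, 15, 16]
```

The smallest value, 14, corresponds to merging the two smallest files first.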

Input: n = 6, size = {2, 3, 4, 5, 6, 7}
Output: 68
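Explanation: The optimal way to combine these files is to merge 2 + 3 = 5, then 4 + 5 = 9, then 5 + 6 = 11, then 7 + 9 = 16, and finally 11 + 16 = 27, for a total of 5 + 9 + 11 + 16 + 27 = 68.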
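The greedy merge order for this input can be traced with a short sketch. It uses Python's standard `heapq` module as the min-heap. The function name `greedy_merge_trace` is hypothetical.

```python
import heapq


def greedy_merge_trace(sizes):
    """Return the list of (first, second, merged) steps chosen greedily."""
    heap = list(sizes)
    heapq.heapify(heap)
    steps = []
    while len(heap) > 1:
        first = heapq.heappop(heap)   # smallest file
        second = heapq.heappop(heap)  # second smallest file
        heapq.heappush(heap, first + second)
        steps.append((first, second, first + second))
    return steps


steps = greedy_merge_trace([2, 3, 4, 5, 6, 7])
for first, second, merged in steps:
    print(f"merge {first} + {second} -> {merged}")
print("total =", sum(merged for _, _, merged in steps))  # total = 68
```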

Input: n = 5, size = {5, 10, 20, 30, 30}
Output: 205
Input: n = 5, size = {8, 8, 8, 8, 8}
Output: 96
Observations:
From the above results, we can conclude that to minimize the total computation cost we must always merge the two smallest files currently available: remove the two smallest sizes from the array, add their sum to the total cost, and put the merged size back. Keeping the array sorted after every merge would work, but a min-heap (priority queue) supports these operations more efficiently.
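For comparison, here is a minimal sketch of the "always sorted array" idea, using Python's standard `bisect` module to reinsert each merged size at its sorted position. The function name `min_cost_sorted_list` is hypothetical. Removing from the front of a list and inserting into the middle are both linear-time operations, which is why a min-heap is preferred for large inputs.

```python
import bisect


def min_cost_sorted_list(sizes):
    """Greedy merge cost computed on a list that is kept sorted."""
    files = sorted(sizes)
    total = 0
    while len(files) > 1:
        # The two smallest sizes are always at the front
        merged = files.pop(0) + files.pop(0)
        total += merged
        # Reinsert the merged size so the list stays sorted
        bisect.insort(files, merged)
    return total


print(min_cost_sorted_list([2, 3, 4]))            # 14
print(min_cost_sorted_list([5, 10, 20, 30, 30]))  # 205
```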
Approach:
Each node represents a file with a given size. The number of files is assumed to be at least 2, since a single file needs no merging.
- Add all the nodes to a priority queue (min-heap), keyed by file size.
- Initialize count = 0 // variable to store the total computations
- Repeat while the size of the priority queue is greater than 1:
  - weight = pq.poll() // remove the smallest element from the priority queue
  - weight += pq.poll() // remove the second smallest element and add it
  - count += weight
  - pq.add(weight) // add the combined size back to the priority queue
- count is the final answer.
Below is the implementation of the above approach:
C++
// C++ program to implement
// Optimal File Merge Pattern
#include <functional>
#include <iostream>
#include <queue>
#include <vector>
using namespace std;

// Function to find minimum computation
int minComputation(int size, int files[])
{
    // Create a min heap
    priority_queue<int, vector<int>, greater<int> > pq;

    for (int i = 0; i < size; i++) {
        // Add sizes to the priority queue
        pq.push(files[i]);
    }

    // Variable to count total computations
    int count = 0;

    while (pq.size() > 1) {
        // Pop the two smallest elements
        // from the min heap
        int first_smallest = pq.top();
        pq.pop();
        int second_smallest = pq.top();
        pq.pop();

        int temp = first_smallest + second_smallest;

        // Add the current computations
        // to the previous ones
        count += temp;

        // Add the new combined file size
        // to the priority queue (min heap)
        pq.push(temp);
    }
    return count;
}

// Driver code
int main()
{
    // Number of files
    int n = 6;

    // 6 files with their sizes
    int files[] = { 2, 3, 4, 5, 6, 7 };

    // Total number of computations
    // to be done (final answer)
    cout << "Minimum Computations = "
         << minComputation(n, files);

    return 0;
}

// This code is contributed by jaigoyal1328
Java
// Java program to implement
// Optimal File Merge Pattern
import java.util.PriorityQueue;

public class OptimalMergePatterns {

    // Function to find minimum computation
    static int minComputation(int size, int files[])
    {
        // Create a min heap
        PriorityQueue<Integer> pq = new PriorityQueue<>();

        for (int i = 0; i < size; i++) {
            // Add sizes to the priority queue
            pq.add(files[i]);
        }

        // Variable to count total computations
        int count = 0;

        while (pq.size() > 1) {
            // Pop the two smallest elements
            // from the min heap
            int temp = pq.poll() + pq.poll();

            // Add the current computations
            // to the previous ones
            count += temp;

            // Add the new combined file size
            // to the priority queue (min heap)
            pq.add(temp);
        }
        return count;
    }

    public static void main(String[] args)
    {
        // Number of files
        int size = 6;

        // 6 files with their sizes
        int files[] = new int[] { 2, 3, 4, 5, 6, 7 };

        // Total number of computations
        // to be done (final answer)
        System.out.println("Minimum Computations = "
                           + minComputation(size, files));
    }
}
Python3
# Python Program to implement
# Optimal File Merge Pattern


class Heap():
    # Building own implementation of Min Heap

    def __init__(self):
        self.h = []

    def parent(self, index):
        # Returns parent index for given index
        if index > 0:
            return (index - 1) // 2

    def lchild(self, index):
        # Returns left child index for given index
        return (2 * index) + 1

    def rchild(self, index):
        # Returns right child index for given index
        return (2 * index) + 2

    def addItem(self, item):
        # Function to add an item to heap
        self.h.append(item)

        if len(self.h) == 1:
            # If heap has only one item no need to heapify
            return

        index = len(self.h) - 1
        parent = self.parent(index)

        # Moves the item up if it is smaller than the parent
        while index > 0 and item < self.h[parent]:
            self.h[index], self.h[parent] = self.h[parent], self.h[index]
            index = parent
            parent = self.parent(index)

    def deleteItem(self):
        # Function to remove and return the smallest item of the heap
        length = len(self.h)
        self.h[0], self.h[length-1] = self.h[length-1], self.h[0]
        deleted = self.h.pop()

        # Since root will be violating heap property
        # Call moveDownHeapify() to restore heap property
        self.moveDownHeapify(0)

        return deleted

    def moveDownHeapify(self, index):
        # Function to make the items follow Heap property
        # Compares the value with the children and moves item down
        lc, rc = self.lchild(index), self.rchild(index)
        length, smallest = len(self.h), index

        if lc < length and self.h[lc] <= self.h[smallest]:
            smallest = lc

        if rc < length and self.h[rc] <= self.h[smallest]:
            smallest = rc

        if smallest != index:
            # Swaps the parent node with the smaller child
            self.h[smallest], self.h[index] = self.h[index], self.h[smallest]

            # Recursive call to compare next subtree
            self.moveDownHeapify(smallest)

    def increaseItem(self, index, value):
        # Increase the value of 'index' to 'value'
        if value <= self.h[index]:
            return

        self.h[index] = value
        self.moveDownHeapify(index)


class OptimalMergePattern():
    def __init__(self, n, items):
        self.n = n
        self.items = items
        self.heap = Heap()

    def optimalMerge(self):
        # Corner case: with no more than 1 file,
        # nothing has to be merged, so the cost is 0
        if self.n <= 1:
            return 0

        # Insert items into min heap
        for item in self.items:
            self.heap.addItem(item)

        count = 0
        while len(self.heap.h) != 1:
            # Remove the smallest size; the new root is the second smallest
            tmp = self.heap.deleteItem()
            count += (tmp + self.heap.h[0])
            # Replace the root with the combined size and restore the heap
            self.heap.increaseItem(0, tmp + self.heap.h[0])

        return count


# Driver Code
if __name__ == '__main__':
    OMP = OptimalMergePattern(6, [2, 3, 4, 5, 6, 7])
    ans = OMP.optimalMerge()
    print("Minimum Computations =", ans)

# This code is contributed by Rajat Gupta
C#
using System;
using System.Collections.Generic;

public class OptimalMergePatterns
{
    // Function to find minimum computation
    static int MinComputation(int size, int[] files)
    {
        // Create a list to store file sizes
        List<int> fileList = new List<int>(files);

        // Variable to count total computations
        int count = 0;

        while (fileList.Count > 1) {
            // Sort the file sizes in ascending order
            fileList.Sort();

            // Get the two smallest file sizes
            int file1 = fileList[0];
            int file2 = fileList[1];

            // Calculate the combined file size
            int combinedFileSize = file1 + file2;

            // Add the current computations
            // to the previous ones
            count += combinedFileSize;

            // Remove the two smallest file sizes
            fileList.RemoveAt(0);
            fileList.RemoveAt(0);

            // Add the new combined file size
            // to the list of file sizes
            fileList.Add(combinedFileSize);
        }
        return count;
    }

    public static void Main(string[] args)
    {
        // Number of files
        int size = 6;

        // 6 files with their sizes
        int[] files = new int[] { 2, 3, 4, 5, 6, 7 };

        // Total number of computations
        // to be done (final answer)
        Console.WriteLine("Minimum Computations = "
                          + MinComputation(size, files));
    }
}

// This code is contributed by phasing17.
JavaScript
// JavaScript program to implement
// Optimal File Merge Pattern

class Heap {
    // Building own implementation of Min Heap
    constructor() {
        this.h = [];
    }

    parent(index) {
        // Returns parent index for given index
        if (index > 0) {
            return Math.floor((index - 1) / 2);
        }
    }

    lchild(index) {
        // Returns left child index for given index
        return 2 * index + 1;
    }

    rchild(index) {
        // Returns right child index for given index
        return 2 * index + 2;
    }

    addItem(item) {
        // Function to add an item to heap
        this.h.push(item);

        if (this.h.length === 1) {
            // If heap has only one item no need to heapify
            return;
        }

        let index = this.h.length - 1;
        let parent = this.parent(index);

        // Moves the item up if it is smaller than the parent
        while (index > 0 && item < this.h[parent]) {
            [this.h[index], this.h[parent]] = [this.h[parent], this.h[index]];
            index = parent;
            parent = this.parent(index);
        }
    }

    deleteItem() {
        // Function to remove and return the smallest item of the heap
        const length = this.h.length;
        [this.h[0], this.h[length - 1]] = [this.h[length - 1], this.h[0]];
        const deleted = this.h.pop();

        // Since root will be violating heap property
        // Call moveDownHeapify() to restore heap property
        this.moveDownHeapify(0);

        return deleted;
    }

    moveDownHeapify(index) {
        // Function to make the items follow Heap property
        // Compares the value with the children and moves item down
        const lc = this.lchild(index);
        const rc = this.rchild(index);
        const length = this.h.length;
        let smallest = index;

        if (lc < length && this.h[lc] <= this.h[smallest]) {
            smallest = lc;
        }

        if (rc < length && this.h[rc] <= this.h[smallest]) {
            smallest = rc;
        }

        if (smallest !== index) {
            // Swaps the parent node with the smaller child
            [this.h[smallest], this.h[index]] = [this.h[index], this.h[smallest]];

            // Recursive call to compare next subtree
            this.moveDownHeapify(smallest);
        }
    }

    increaseItem(index, value) {
        // Increase the value of 'index' to 'value'
        if (value <= this.h[index]) {
            return;
        }

        this.h[index] = value;
        this.moveDownHeapify(index);
    }
}

class OptimalMergePattern {
    constructor(n, items) {
        this.n = n;
        this.items = items;
        this.heap = new Heap();
    }

    optimalMerge() {
        // Corner case: with no more than 1 file,
        // nothing has to be merged, so the cost is 0
        if (this.n <= 1) {
            return 0;
        }

        // Insert items into min heap
        for (const item of this.items) {
            this.heap.addItem(item);
        }

        let count = 0;
        while (this.heap.h.length !== 1) {
            // Remove the smallest size; the new root is the second smallest
            const tmp = this.heap.deleteItem();
            count += tmp + this.heap.h[0];
            // Replace the root with the combined size and restore the heap
            this.heap.increaseItem(0, tmp + this.heap.h[0]);
        }

        return count;
    }
}

// Driver Code
const OMP = new OptimalMergePattern(6, [2, 3, 4, 5, 6, 7]);
const ans = OMP.optimalMerge();
console.log("Minimum Computations =", ans);

// This code is contributed by phasing17
Output
Minimum Computations = 68
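Time Complexity: O(n log n) for the heap-based implementations, since each of the n - 1 merges performs a constant number of O(log n) heap operations. The C# implementation above re-sorts the list on every iteration, so it takes O(n^2 log n).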
Auxiliary Space: O(n)