Table of Contents
Merge sort is one of the most powerful sorting algorithms. Merge sort is widely used in various applications as well. The best part about these algorithms is that they are able to sort a given data in O(nlogn) complexity as against O(n_{2}) complexity (we will soon see how) of bubble sort and selection sort. Moreover, merge sort is of interest because it creates an excellent case study for one of the widely used techniques in Computer Science – divide and conquer.
Merge Sort Algorithm – Explanation
Given an array of length say n, we perform the following steps to sort the array:

 Divide the array into 2 parts of lengths n/2 and n – n/2 respectively (here if n is odd, we round off the value of n/2). Let us call these arrays as left half and right half respectively.
 Recursively sort the left half array and the right half array.
 Merge the left half array and right half array to get the full array sorted.
Let us take an example:
Given Array: [6, 4, 5, 1, 2, 7, 3]
First, as per step 1 above, we divide the array into 2 parts. As we can see, the following are the subarrays left half and right half:
 Left half: [6, 4, 5, 1]
 Right half: [2, 7, 3]
Then, as per step 2 above, we recursively sort the left and right halves. Here is how the sorted subarrays will look like:
 Recursively sorted left half: [1, 4, 5, 6]
 Recursively sorted right half: [2, 3, 7]
Finally, as per step 3, we will merge these 2 halves to create the final sorted array. Final merged and sorted array: [1, 2, 3, 4, 5, 6, 7]
The left and the right halves can always be sorted recursively using the same algorithm. The magic happens in creating the final merged and sorted array. So, let us understand it well using the above example.
In the above example, we are given 2 arrays [1, 4, 5, 6] and [2, 3, 7]. We are supposed to merge these 2 arrays into a single sorted array. Let us place a pointer at the head of each array. We will depict the pointer by underlining the corresponding element where the pointer points to.
Final merged array = []
Left array: [1, 4, 5, 6]
Right array: [2, 3, 7]
As can be seen, the pointer of the left array is at 1 and the pointer of the right array is at 2. We pick the smaller one and put it in the final merged array and move the corresponding pointer. After doing this, we will have the following state:
Final merged array = [1]
Left array: [4, 5, 6]
Right array: [2, 3, 7]
Here the pointers are now at 4 and 2 respectively. We again do what we did above – pick the smaller one and put it in the final merged array and move the corresponding pointer. We will get the following:
Final merged array = [1, 2]
Left array: [4, 5, 6]
Right array: [3, 7]
We repeat this again to get:
Final merged array = [1, 2, 3]
Left array: [4, 5, 6]
Right array: [7]
Continuing this exercise, we can see that we are successfully able to get the final merged array in the sorted form:
Final merged array = [1, 2, 3, 4, 5, 6, 7]
Left array: []
Right array: []
So, as can be seen, we started with an unsorted array and we were successfully able to get a sorted array. Another question that is to be answered – how were the left and the right arrays sorted? Well, we sorted them recursively using the same technique as above. For instance, consider the right array: [2, 7, 3]. To sort it, we will break it again into 2 subarrays: [2, 7] and [3]. Both these subarrays are already sorted so we can simply merge them by using the technique explained above to get the sorted array [2, 3, 7].
Take a look at the following image to understand how this same procedure is recursively applied on the subarrays:
In the above image, we’ve shown the actual subarray in black and the resultant sorted subarray in blue. Let us understand the detailed steps involved in performing a merge sort in the above array:
 [6, 4, 5, 1, 2, 7, 3] is divided into [6, 4, 5, 1] and [2, 7, 3]
 [6, 4, 5, 1] is divided into [6, 4] and [5, 1]
 [6, 4] is divided into [6] and [4]

 [6] is a single element array and so, is sorted.

 [4] is a single element array and so, is sorted.
 [6] and [4] are merged into [4, 6]
 [5, 1] is divided into [5] and [1]

 [5] is a single element array and so, is sorted.

 [1] is a single element array and so, is sorted.
 [5] and [1] are merged into [1, 5]

 [4, 6] and [1, 5] are merged into [1, 4, 5, 6]
 [2, 7, 3] is divided into [2, 7] and [3]
 [2, 7] is divided into [2] and [7]

 [2] is a single element array and so, is sorted.

 [7] is a single element array and so, is sorted.
 [2] and [7] are merged into [2, 7]
 [3] is a single element array and so, is sorted.
 [2, 7] and [3] are merged into [2, 3, 7]
 [1, 4, 5, 6] and [2, 3, 7] are merged into [1, 2, 3, 4, 5, 6, 7]
Observe one important point – we need a separate array in order to store the data of the final merged array. This means that merge sort requires additional space.
So, that’s how merge sort works. Here is an animation that explains the same.
Merge Sort Pseudo Code
Before we get into the actual code, let us take a look at the pseudo code.
function merge_sort(i, j, a, aux) { mid = (i + j) / 2 merge_sort(i, mid, a, aux) merge_sort(mid + 1, j, a, aux) pointer_left = i, pointer_right = mid + 1 for k in [i ... j] { if pointer_left points to smaller element, aux[k] = a[pointer_left] and increment pointer_left by 1 if pointer_right points to smaller element, aux[k] = a[pointer_right] and increment pointer_right by 1 } copy the contents of aux[i .. j] to a[i .. j] }
Now, let us take a look at the actual working code.
Merge Sort Program Code
Let us understand the code stepbystep:
void merge_sort(int i, int j, int a[], int aux[])
This prototype means that the merge_sort function sorts the subarray a[i .. j] using auxiliary array aux[]
.
if (j <= i) { return; }
if j <= i
, clearly, the subarray a[i .. j]
contains either 1 element (which is sorted) or no elements (which is also sorted). So, we do nothing in this case and simply return.
int mid = (i + j) / 2;
We plan to partition the array into 2 subarrays of almost equal lengths. These subarrays are a[i .. mid]
and a[mid + 1 .. j]
. Clearly, mid = (i + j) / 2
is the best here since mid is the average of i and j.
merge_sort(i, mid, a, aux); merge_sort(mid + 1, j, a, aux);
Here, we are recursively sorting a[i .. mid]
and a[mid + 1 .. j]
subarrays by calling the same merge_sort function.
Once we have these 2 sorted subarrays in place, the rest of the code simply merges the 2.
int pointer_left = i; int pointer_right = mid + 1; int k;
Here, we place pointer_left at the beginning of the left subarray a[i .. mid]
and the pointer_right at the beginning of the right subarray a[mid + 1 .. j]
.
for (k = i; k <= j; k++) { if (pointer_left == mid + 1) { aux[k] = a[pointer_right]; pointer_right++; } else if (pointer_right == j + 1) { aux[k] = a[pointer_left]; pointer_left++; } else if (a[pointer_left] < a[pointer_right]) { aux[k] = a[pointer_left]; pointer_left++; } else { aux[k] = a[pointer_right]; pointer_right++; } }
Here, we have 4 cases:

pointer_left == mid + 1:
in this case, the left subarray is finished and all of its elements have already been merged.

pointer_right == j + 1:
in this case, the right subarray is finished and all of its elements have already been merged.

a[pointer_left] < a[pointer_right]:
here, none of the 2 arrays have finished. However, the pointer_left points to a smaller element than pointer_right and so, we put that in the merged array.
 else the last case: here, none of the 2 arrays have finished. However, the pointer_right points to a smaller element than pointer_left and so, we put that in the merged array.
Finally, we copy the elements from aux[] to a[].
for (k = i; k <= j; k++) { a[k] = aux[k]; }
So, that’s how merge sort works.
Merge Sort Complexity
Complexity gives a rough idea of the time taken to execute the algorithm as a function of the size of the input. For instance, let T(n) be the time taken to perform merge sort on an array of size n.
As we can see, that T(n) comprises of 3:

 Time spent in performing merge sort on the left half. The left half is of size n/2 and so, the time spent would be nothing but T(n/2).

 Time spent in performing merge sort on the right half. The right half is of size n/2 and so, the time spent here would also be T(n/2).
 Time spent in merging the left and the right halves. As we can see, to merge the 2 halves, we place pick each element onebyone from the 2 subarrays and fill in the original array. Since there are n elements, the time taken in merging would be proportional to n. So, let us call this time as cn where c is some constant.
Total time, T(n) = T(n/2) + T(n/2) + cn
So, we have the equation as: T(n) = 2T(n/2) + cn. With some maths, this equation can be solved as
T(n) = 2T(n/2) + cn
= 2(2T(n/4) + cn/2) + cn = 2^{2}T(n/2^{2}) + 2cn
= 2(2(2T(n/8) + cn/4) + cn/2) + cn = 2^{3}T(n/2^{3}) + 3cn
…
…
The kth term of the above series is: 2^{k}T(n/2^{k}) + kcn
Put 2k = n, we have k = log_{2}n. We put this value of k in the equation above to get: T(n) = nT(1) + cnlog_{2}n
Here, T(1) and c are constants. So, we can write T(n) = An + Bnlog_{2}n. Since the term nlog_{2}n is larger than n, we can see that nlog_{2}n is the dominant term. Just to give you an idea, when n = 232, nlog_{2}n = 32 * 232, which is clearly an order of magnitude larger. So, T(n) can be written as T(n) = O(nlog_{2}n).
The complexity of bubble sort algorithm on the other hand as we saw was O(n^{2}). Clearly, merge sort is much faster than bubble sort algorithm and that’s why it is widely used in various applications and libraries.
Conclusion
Merge sort is an interesting algorithm and forms a great casestudy to understand data structures and algorithms. In order to develop strong foundations in computer science, you are advised to thoroughly understand various sorting algorithms which will help you pick up the basics.