
Dependencies

Merge Sort here works in place. The merging proceeds bottom-up: sub-sequences of size one are trivially sorted, so each pair of adjacent sub-sequences (initially single elements) is merged/aggregated into a sorted sub-sequence containing both, and this is repeated level by level until the entire sequence has been merged.

Steps

  • 1. Pick the middle of the sequence.
  • 2.a. Recurse on the left side.
  • 2.b. Recurse on the right side.
  • 3. Merge both sorted halves.

Pseudo-Code

void MergeSort(Iterator begin, Iterator end)
{
  const auto size = distance(begin, end);
  if (size < 2)
    return;

  auto pivot = begin + (size / 2);

  // Recursively break the vector into two pieces
  MergeSort(begin, pivot);
  MergeSort(pivot, end);

  // Merge the two pieces
  Merge(begin, pivot, end);
}
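
For reference, here is a minimal runnable C++ sketch of the above, using std::inplace_merge to stand in for the Merge/Aggregate step described in this article (note that std::inplace_merge may use a temporary buffer when one is available, so it is not guaranteed to be strictly O(1) in space):

#include <algorithm>
#include <iostream>
#include <iterator>
#include <vector>

template <typename Iterator>
void MergeSort(Iterator begin, Iterator end)
{
  const auto size = std::distance(begin, end);
  if (size < 2)
    return;

  auto pivot = std::next(begin, size / 2);

  // Recursively sort both halves
  MergeSort(begin, pivot);
  MergeSort(pivot, end);

  // Merge the two sorted halves in place
  std::inplace_merge(begin, pivot, end);
}

int main()
{
  std::vector<int> data{5, 1, 4, 2, 8, 3};
  MergeSort(data.begin(), data.end());
  for (int v : data)
    std::cout << v << ' ';   // Prints: 1 2 3 4 5 8
  return 0;
}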

Reminder

The in-place version of the Aggregate/Merge has:

  • O(n/2) ~ O(n) time in the best case (here, m = n/2)
  • O(n²/2) ~ O(n²) time in the worst case (here, m = n/2)
  • A space complexity of O(1) in all cases
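
For illustration only (a common rotation-based variant, not necessarily the exact Aggregate implementation used by Hurna), an in-place merge can be sketched as follows; the cost comes from std::rotate shifting the remaining left part by one slot every time a right-hand element is out of order, which is what makes the worst case quadratic:

#include <algorithm>
#include <iterator>

template <typename Iterator>
void MergeInPlace(Iterator begin, Iterator middle, Iterator end)
{
  // Invariant: everything before 'begin' is already merged;
  // [begin, middle) and [middle, end) are still sorted.
  while (begin != middle && middle != end)
  {
    if (!(*middle < *begin))
    {
      ++begin;                                  // Already in order: nothing to move
    }
    else
    {
      // Shift [begin, middle) one step right and bring *middle in front of it
      std::rotate(begin, middle, std::next(middle));
      ++begin;
      ++middle;
    }
  }
}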

Master Theorem for Divide and Conquer Recurrences

$$T(n) = O(n) + 2T\left(\frac{n}{2}\right) \;\Rightarrow\; T(n) = O(n \log n)$$

Time

Best

The best configuration occurs when all elements are already sorted or nearly sorted.
This means that each Merge/Aggregation behaves as an O(n) algorithm. Thus, let's put this into a recurrence relation: $$T(n) = O(n) + T\left(\frac{n}{2}\right) + T\left(\frac{n}{2}\right)$$ $$T(n) = O(n) + 2T\left(\frac{n}{2}\right)$$ Using the master theorem for divide-and-conquer recurrences: T(n) = O(n log n).
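
Spelled out with the master theorem parameters (a quick sanity check rather than a formal proof):

$$a = 2,\quad b = 2,\quad f(n) = O(n),\quad n^{\log_b a} = n^{\log_2 2} = n$$
$$f(n) = \Theta\!\left(n^{\log_b a}\right) \;\Rightarrow\; T(n) = \Theta(n \log n)$$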

Average / Worst

The worst case O(n²) occurs when, during each merge, all elements of the first sequence are greater than all elements of the second one (cf. the in-place aggregate).
Putting this into a recurrence relation, the merge step now costs O(n²/2): $$T(n) = O\left(\frac{n^2}{2}\right) + T\left(\frac{n}{2}\right) + T\left(\frac{n}{2}\right)$$ $$T(n) = O\left(\frac{n^2}{2}\right) + 2T\left(\frac{n}{2}\right)$$

Using the master theorem for divide-and-conquer recurrences: T(n) = O(n²).
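
Here the merge cost dominates the recursion (master theorem, third case):

$$f(n) = O\!\left(\frac{n^2}{2}\right) = \Omega\!\left(n^{1+\epsilon}\right) \text{ with } \epsilon = 1 \;\Rightarrow\; T(n) = \Theta(n^2)$$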

Space

As a reminder:
- In-place aggregation requires O(1) space.
- The two recursive calls are made one after the other: the right half is processed only after the left half has finished, so the right computation does not add to the left computation's call stack. The stack usage is therefore bounded by the recursion depth, O(log n).

Several attempts have been made at producing an in-place merge algorithm that can be combined with a standard (top-down or bottom-up) merge sort to produce an in-place merge sort. In this case, the notion of "in-place" can be relaxed to mean "taking logarithmic stack space", because standard merge sort requires that amount of space for its own stack usage.

Parallelization

Because it is a divide-and-conquer algorithm, the work can easily be split up and processed by different threads, a very useful trait for high-performance systems handling large amounts of data.

The parallelization is as simple as forking both merge sort recursions and joining them right before the merge/aggregate step:
void MergeSort(Iterator begin, Iterator end)
{
  const auto size = distance(begin, end);
  if (size < 2)
    return;

  auto pivot = begin + (size / 2);

  // Fork: sort each half on its own thread
  thread thread1([=] { MergeSort(begin, pivot); });
  thread thread2([=] { MergeSort(pivot, end); });

  // Join them before merging
  thread1.join();
  thread2.join();

  // Merge the two pieces
  Merge(begin, pivot, end);
}
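
Note that spawning two new threads at every level of the recursion quickly becomes counter-productive: thread creation has a cost and the number of threads grows with the depth. A common refinement, sketched below with std::async and an arbitrary size threshold (both are assumptions, not part of the original pseudo-code), is to parallelize only the upper levels and fall back to sequential recursion on small sub-sequences:

#include <algorithm>
#include <future>
#include <iterator>

template <typename Iterator>
void ParallelMergeSort(Iterator begin, Iterator end)
{
  const auto size = std::distance(begin, end);
  if (size < 2)
    return;

  auto pivot = std::next(begin, size / 2);

  if (size < 4096)   // Arbitrary cutoff: below it, threading overhead outweighs the gain
  {
    ParallelMergeSort(begin, pivot);
    ParallelMergeSort(pivot, end);
  }
  else
  {
    // Fork the left half on another thread, handle the right half on this one
    auto left = std::async(std::launch::async, [=] { ParallelMergeSort(begin, pivot); });
    ParallelMergeSort(pivot, end);
    left.wait();     // Join before merging
  }

  // Merge the two sorted halves (std::inplace_merge as a stand-in for Merge/Aggregate)
  std::inplace_merge(begin, pivot, end);
}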