# EPI 5.1 | The Dutch National Flag Problem

Reference: EPI 5.1

Description:

The quicksort algorithm would have a bad performance when the array contains many duplicates because the subarrays may differ greatly in size.

One solution is reordering or shuffling the array before sorting. Another solution is known as Dutch national flag partitioning since the Dutch national flag consists of three horizontal bands (red, white, blue).

Let say an example [0, 1, 2, 0, 2, 1, 1], and the pivot index is $3$ (A[3] = 0), so the correct partitioning is [0, 0, 1, 2, 2, 1, 1]. The elements larger than $0$ should be after $0$, but the ordering does not matter.

If the pivot index is $1$ (A[1] = 1), so the correct partitioning is [0, 0, 1, 1, 1, 2, 2].

We have three groups: smaller, equal, larger. To simply the problem we have a enum class as follows:

We use RED to denote elements in the smaller group. They could be duplicates or different elements, but that is okay as long as they are less than the WHITE elements in the equal group.

## Problem

Write a problem that takes an array A and an index i into A, and rearranges the elements such that all elements less than A[i] (the “pivot”) appear first, followed by elements equal to the pivot, followed by elements greater than the pivot.

Note: The problem is trivial to solve with $O(N)$ extra space. We can put elements into three arrays and then merge them.

## Analysis

### Brute-Force

We can avoid the extra space. In the first pass, we iterate through the array and seek an element smaller than the pivot. As soon as we find it, we move it to the subarray of smaller elements via swapping. In the second pass, we do the same thing but moving the larger element to the subarray of larger elements.

Note: In the second pass, we can stop when hitting an element smaller than the pivot, because smaller elements should all lie on the left part of the array.

Time: $O(N^2)$
Space: $O(1)$

The worst case occurs when the pivot is in the middle and all elements on its left are larger and the elements on its right are smaller, for example [5, 6, 9, 7, <5>, 3, 2, 4, 1]. As for the first element (i = 0), j has to go through half of the array, which takes $O(N/2)$ time, to find element 3.

### Two Passes

We can improve the time complexity by making a single pass to move smaller elements to the beginning and by making another pass starting from the end to move larger elements to the end.

Alternatively, in the second pass, we can instead move elements which are equal to the pivot, so we don’t need an extra variable to store the current index.

Note: The element in the original pivotIndex may change when the pivot element is swapped. It does not affect the result since we have backed up the pivot at the beginning of the program, so the value of pivot won’t change.

Time: $O(N)$
Space: $O(1)$

### One Pass

Similar to the two-pass approach, the difference is that it performs classification into elements less than, equal to, and greater than the pivot in a single pass. We can do this by maintaining four subarrays: smaller, equal, unclassified, larger. In the code, we use three pointers smaller, equal, larger to denote the elements in the four subarrays.

The explanation is in the comments. Notice the equal always points to the incoming unclassified element. And the elements before smaller and the elements after larger are finalized.

Example:

Note: This is my solution based on larger = A.size() - 1 and while (equal <= larger). The solution in the book is based on larger = A.size() and while (equal < larger). They are equivalent.

Time: $O(N)$ since each time at least one of the pointers is moved and each subarray is either expended by one or shrunk by one.
Space: $O(1)$

## Variant 1

Bring duplicate elements all together, but the ordering is not required.

Time: $O(N)$
Space: $O(1)$

Comment
Junhao Wang
a software engineering cat