Sorting is a basic topic in computer science and it is all about putting a finite sequence in order according to some measure. The usual presentation involves either pseudo-code or actual code. I think it’s clearer just to use recursive functions on sequences – which nobody does, for some reason. For example, Quicksort is notoriously slippery in code but pretty straightforward as a map on sequences.
In this note all sequences are finite and they are either represented directly within square brackets, for example or as maps where the domain (pre-image) is the set of indexes. The empty sequence is written Nil. Sequence length is given by Length(s). Representing sequences as finite maps is unusual, but it has some advantages. For example say Concat(s,u) is a finite sequence w, where w(i)=s(i) for 0< i ≤ Length(s) and w(i) = u(i – Length(s)) otherwise. Say a sequence s is “sorted” under some measure m iff for every 0 < i < j ≤ Length(s), we have m(s(i)) ≤ m(s(j)). I am arbitrarily assuming ordered sequences are in ascending order and the “measure” can be anything as long as there is an associated total order. A measure could be number of likes on some social media or an absolute value or whatever seems appropriate for the sequence values.
Recall that a permutation of a set X is a one-to-one map of X onto X. A permutation of a finite sequence s is a finite sequence w with the same length so that for some permutation π of the set of indexes of s, for i in the set of indexes, we have s(i) = u(π(i)). The problem of sorting is the problem of finding a permutation of a sequence that is ordered by the chosen measure (there is no requirement that such a permutation is unique).
Quicksort is a famous sorting method that is interesting both at a high level and because it can be made to work efficiently (in most cases) by just swapping elements. The core operation is “partitioning” around a “pivot” value.
Quicksort partitioning produces a subsequence called the “pivot” where all elements have the same measure, and two other subsequences: one of elements where the measure is less than or equal to the measure of the elements of the pivot and one with elements that have greater measure. Concatenating these sequences products a permutation of the original that is more ordered. The quicksort algorithm recursively applies partitioning to the non-pivot subsequences until – almost magically, the whole sequence is ordered.
Say a map f is a partition map for some set of sequences and some measure m only if for every non-Nil sequence s, f(s)= (u,v,w) where (1) Concat(u,Concat(v,w)) is a permutation of s, (2) v is not empty and every element of v has the same measure, (3) the measures of the elements of u are less than or equal to the measure of the elements of v, and (4) the measures of the elements of w are greater than the measure of the elements of v.
A sequence of length 1 has a simple partition into two empty elements and the pivot. A two element sequence is ordered by partition. Suppose s=[x,y] and f(s)=(u,v,w). If m(x)=m(y) then the partitioned sequence is trivially ordered. If m(x) < m(y) then since v cannot be Nil, v=[x] and w=[y] and Concat(u,Concat(v,w)) = [x,y]. If m(x)>m(y) then u=Nil, v=[y] and w=[x] for the same result. Quicksort (QS) recursively applies partitioning until it has partitioned the original sequence into 0,1 or 2 element subsequences
- QS(s) = s if Length(s) < 2
- QS(s) = Concat( QS(f(u)), Concat(v,QS(f(w))) otherwise where f(s)=(u,v,w) or we could write QS(s) = Concat(QS( (f(s)1),Concat(f(s)2,QS(f(s)3))
Note that if u,v, and w are all ordered by m, v is not empty, and properties (3) and (4) hold, the concatenation of the three sequences must be ordered. Properties (1) and (4) above are sufficient to assure that the recursion terminates since both u and w must contain less elements than s and the sum of the the three lengths must be the same as the length of s. Suppose that QS(s) sorts s for any s of lengths n or less and then consider some s of length n+1. Because v is not empty, u and w are length n or less so, by the induction hypothesis QS(u) and QS(w) must sort u and w respectively. By (2), v is already ordered and concatenating three such ordered sequences produces an ordered sequence.
Any partitioning function will do, but Quicksort came with a clever construction of the partitioning map from repeated “swapping” of pairs of elements.
if u = swap(s,i,j) for i and j in the index set of s, then u(i)=s(j) and u(j)=s(i) and u(k)= s(k) for k not equal to i or j.
Swap(s,i,j) is a permutation of s. Quicksort partitioning is done by taking the first element of the sequence , s(1), as the pivot element. Then keep swapping until the sequence looks like Concat(v,Concat(u,w)) where v is just the single element sequence consisting of s(1) serving as the pivot sequence and u and w are the other sequences in the partition. Let L(p, s) be the index of the end of Concat(v,u). Since s(1)$ is the pivot element, L(s(1),s) >0. Let R(p,s) be the greatest index of an element with measure less than or equal to the measure of the pivot p.
- L(p,s) = max: i ≤ Length(s) s.t. forall 0 < j ≤ i , m(s(j)) ≤ m(p)
- R(p,s) = max: i ≤ Length(s) s.t. m(s(i)) ≤ m(p)$
Again, R(s(1),s) >0. It is impossible for R(s(1),s) < L(s(1),s) by construction. If R(s(1),s) > L(s(1),s) then swap the element at L(s(1),s)+1 (which has to have measure greater than the measure of the pivot) with the element at R(s(1),s). As soon as L(s(1),s) = R(s(1),s) the partitioning is done. In that case v = [s(1)], u is the sequence from s(2) till L(s(1),s) and w is what’s left.
- f(s) = s if Length(s) < 2$
- f(s) = f(swap(s, L(s(1),s)+1,R(s(1),s)) if L(s(1),s) < R(s(1),s)
- f(s) = ([s(2) … L(s(1),s)], [s(1)], [L(s(1),s)+1 … s(Length(s))] if L(s(1),s)=R(s(1),s).
Then this map f produces partitions according to the definition above.