A fast, rank based Radix Sort for floats?

1k views Asked by At

I'm searching for a fast stable radix sort implementation (with support of floats) which returns the indices of the sorted order instead of the sorted values.

Pierre Terdiman's version from his article "Radix Sort Revisited" does exactly what I want however it is more than 13 years old, and not suitable for modern pipelined CPUs.

Michael Herf's RadixSort11 from "Radix Tricks" is pretty fast, the only problem that it returns the sorted values instead of the indices, furthermore it corrupts the values of the input array.

Any help would be appreciated.

3

There are 3 answers

0
Ben Voigt On BEST ANSWER

You can either

  1. Expand each item to include its original index (this could be done during the first counting pass). Of course, the index digits are ignored for sorting purposes.

  2. Store indices into buckets instead of values. Lookup the value each time the digits are required.

The first takes more space but has better locality of reference, the second saves space.

3
brian beuning On

It is fairly straight forward to make any sort index based. Any sort is a series of comparisons and swaps, so do this.

// data to be sorted is in data[ 0 .. n ]
int index[ n + 1 ];
for( int i = 0; i <= n; i++ ) index[i] = i;
// To compare data, compare data[index[j]] < data[index[k]]
// To swap values, swap index[j] <=> index[k]
0
iavr On

I am not familiar with those implementations, but here is the inner function in one of my implementations, for integers only:

//-------------------------------------------------------------------------------------
//! sort the source array based on b-th byte and store the result in destination array
//! and keep index (how to go from the sorted array to the un-sorted)

template<typename T, typename SS, typename SD> inline
void radix_sort_byte(size_t b, array<T, SS>& src, array<T, SD>& dest,
             size_array& ind_src, size_array& ind_dest)
{
    b *= 8;
    size_t B = 256, N = src.size();

    size_array bytes = (src >> b) & 0xff;  // current byte of each element
    size_array count(B, size_t(0));  // occurrences of each element
    ++count[bytes];

    if(count[0] == N)  // all bytes are zero; order remains unchanged
        { dest = src; ind_dest = ind_src; return; }

    size_array index = shift(cumsum(count), 1);  // index-list for each element
    size_array pos(N);  // position of each element in the destination array
    for(size_t i=0; i<N; i++) pos[i] = index[bytes[i]]++;

    dest[pos] = src;  // place elements in the destination array
    ind_dest[pos] = ind_src;  // place indices
}

It is not directly readable because it uses lots of auxiliary structures and functions, but the idea is that you keep a separate array with the indices. Once you have the position of elements in the destination array (pos), the last two lines update the value array and index array in exactly the same way.

I guess you can apply the same idea to any implementation, but you'd have to modify the code.