I'm searching for a fast stable radix sort implementation (with support of floats) which returns the indices of the sorted order instead of the sorted values.
Pierre Terdiman's version from his article "Radix Sort Revisited" does exactly what I want however it is more than 13 years old, and not suitable for modern pipelined CPUs.
Michael Herf's RadixSort11 from "Radix Tricks" is pretty fast, the only problem that it returns the sorted values instead of the indices, furthermore it corrupts the values of the input array.
Any help would be appreciated.
You can either
Expand each item to include its original index (this could be done during the first counting pass). Of course, the index digits are ignored for sorting purposes.
Store indices into buckets instead of values. Lookup the value each time the digits are required.
The first takes more space but has better locality of reference, the second saves space.