Can Thrust::sort/unique handle NaN, Inf?


In my CUDA code:

// declaring data
float * data = new float[size];
[...]
[fill data]
[...]

// sorting
thrust::sort(data, data + size, thrust::greater<float>());

// unique
thrust::unique(thrust::device, data, data + size);

Output:

data =

0.1000
0.1000
0.1000
     0
-0.3000
-0.2000
-0.1000
-Inf
-Inf
-Inf
NaN
Inf
Inf
Inf
-Inf
-Inf  
NaN
Inf
Inf
Inf
Inf

My output, shown here as displayed in MATLAB, is not sorted and the duplicates are not removed. Neither unique nor sort is working at all. Is a pointer to an array not supported by Thrust?

There is 1 answer

talonmies On

No comparison-based algorithm can work correctly on data containing NaN values, because NaN is unordered: every ordering comparison involving NaN evaluates to false, and NaN does not even compare equal to itself. Inf and -Inf are comparable, and work with Thrust or C++ standard library algorithms that perform comparisons.
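To make the point about NaN concrete, here is a minimal host-side sketch in plain C++ (no Thrust involved). The helper names equivalent_under_less, breaks_strict_weak_ordering and check_nan_and_inf_comparisons are hypothetical and exist only for this illustration; IEEE 754 floats and default compiler flags (no fast-math) are assumed.

```cpp
#include <cassert>
#include <limits>

// Two values look "equivalent" to a comparison sort when neither one
// orders before the other. (Hypothetical helper for illustration.)
bool equivalent_under_less(float a, float b) {
    return !(a < b) && !(b < a);
}

// True when equivalence is not transitive for (a, b, c), which violates
// the strict weak ordering that comparison sorts assume.
bool breaks_strict_weak_ordering(float a, float b, float c) {
    return equivalent_under_less(a, b) && equivalent_under_less(b, c) &&
           !equivalent_under_less(a, c);
}

// Runtime checks documenting the behaviour (IEEE 754 floats assumed).
void check_nan_and_inf_comparisons() {
    const float nan = std::numeric_limits<float>::quiet_NaN();
    const float inf = std::numeric_limits<float>::infinity();

    // Every comparison with NaN is false, including NaN == NaN, so a
    // unique pass can never merge two NaN entries.
    assert(!(nan < 1.0f) && !(nan > 1.0f) && !(nan == nan));

    // NaN looks equivalent to both 1 and 2, yet 1 < 2.
    assert(breaks_strict_weak_ordering(1.0f, nan, 2.0f));

    // Infinities order like ordinary values.
    assert(-inf < 0.0f && 0.0f < inf && inf == inf);
    assert(!breaks_strict_weak_ordering(-inf, 0.0f, inf));
}
```

Calling check_nan_and_inf_comparisons() from main should pass all assertions under those assumptions. The second check is the important one: a sort that sees NaN as "equal" to both 1 and 2 has no consistent place to put it, which is why the output order becomes unpredictable.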

The solution here is to first remove the NaN values (thrust::remove_if can be used for this, with a functor or lambda expression that uses isnan), then run the comparison-based algorithms on the remaining data. So something like this:

#include <cmath>
#include <iostream>
#include <thrust/remove.h>
#include <thrust/unique.h>
#include <thrust/sort.h>

int main()
{
    const int N=18;
    unsigned int data[N] = { 
         0x3e99999a, 0x7f800000, 0xff800000, 0x7fffffff, 0x3e4ccccd, 0x3dcccccd, 
         0x3e99999a, 0x7f800000, 0xff800000, 0x7fffffff, 0x3e4ccccd, 0x3dcccccd, 
         0x3e99999a, 0x7f800000, 0xff800000, 0x7fffffff, 0x3e4ccccd, 0x3dcccccd };

    float* input = reinterpret_cast<float*>(&data[0]);

    {
        std::cout << "Input" << std::endl;
        auto it = input;
        for(; it != input+N; ++it) { std::cout << *it << std::endl; }
        std::cout << std::endl;
    }

    auto pred = [](const float& v){ return isnan(v); };
    auto input_end = thrust::remove_if(input, input+N, pred);
    thrust::sort(input, input_end);
    input_end = thrust::unique(input, input_end);

    {
        std::cout << "Output" << std::endl;
        auto it = input;
        for(; it != input_end; ++it) {std::cout << *it << std::endl; }
        std::cout << std::endl;
    }

    return 0;
}

Compiling and running it gives:

~/SO$ nvcc --std=c++11 -arch=sm_52 inf_nan.cu 
~/SO$ ./a.out 
Input
0.3
inf
-inf
nan
0.2
0.1
0.3
inf
-inf
nan
0.2
0.1
0.3
inf
-inf
nan
0.2
0.1

Output
-inf
0.1
0.2
0.3
inf

So Thrust is perfectly capable of sorting and extracting unique values, including Inf and -Inf, once the NaN entries are removed from the input data.
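For comparison, the same remove, sort, unique sequence can be sketched with the C++ standard library, this time in the descending order the question asked for. This swaps in std:: algorithms on a std::vector instead of Thrust, so it is an illustration rather than the code above; the helper name sorted_unique_without_nan is hypothetical. As with thrust::remove_if and thrust::unique, the std:: versions only return a new logical end, so the leftover tail has to be trimmed explicitly.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <functional>
#include <vector>

// Hypothetical helper (not part of Thrust or the code above): returns
// the distinct non-NaN values of `values` in descending order.
std::vector<float> sorted_unique_without_nan(std::vector<float> values) {
    // Step 1: drop NaN entries; remove_if returns the new logical end.
    auto new_end = std::remove_if(values.begin(), values.end(),
                                  [](float v) { return std::isnan(v); });
    values.erase(new_end, values.end());

    // With no NaN left, operator> is a valid ordering for the data.
    assert(std::none_of(values.begin(), values.end(),
                        [](float v) { return std::isnan(v); }));

    // Step 2: sort in descending order; Inf and -Inf order normally.
    std::sort(values.begin(), values.end(), std::greater<float>());

    // Step 3: collapse adjacent duplicates and trim the leftover tail.
    values.erase(std::unique(values.begin(), values.end()), values.end());
    return values;
}
```

Fed the values from the example above (0.3, inf, -inf, nan, 0.2, 0.1, repeated), this should return inf, 0.3, 0.2, 0.1, -inf. Ignoring the iterator returned by the unique step, as the code in the question does, leaves stale elements at the end of the buffer even when the data contains no NaN.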