I have a data structure already in use on CUDA that collects data as shown below:
    struct SearchDataOnDevice
    {
        size_t npair;
        int * id1;
        int * id2;
    };
I'd like to remove duplicated id pairs, controlled by an option called same_id_src. When same_id_src is true, <0, 5> and <5, 0> are considered duplicates and <5, 0> should be removed. When same_id_src is false, both pairs should be kept.
I am new to CUDA and the Thrust library. Can anyone help with a quick hint?
Here is one possible approach:
1. Sort the pairs so that duplicates end up adjacent (thrust::sort)
2. Mark each pair that duplicates its neighbor (thrust::transform)
3. Use thrust::copy_if to perform stream compaction on the sorted pairs to produce the deduplicated result

The need to handle the cases where e.g. <0 5> and <5 0> are considered "identical" or not is handled via modification to the sort functor, as well as modification to the transform functor. In the sort functor, we reorder each pair, for comparison purposes, such that the lower ID appears first in the pair. We must arrange the sort functor carefully so that <0 5> is chosen preferentially over <5 0> when the special condition is true.

Here is an example:
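As a rough, host-only illustration of the sort / mark / compact steps, the sketch below uses std::sort, std::transform and a plain loop where the approach above uses thrust::sort, thrust::transform and thrust::copy_if. The names IdPair, PairLess, PairEqual and dedup_with_flags are hypothetical and chosen only for this sketch. On the device, the same comparisons would presumably live in `__host__ __device__` functors applied to the zipped id1/id2 arrays.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical array-of-structs view of one (id1, id2) entry.
struct IdPair { int id1; int id2; };

// Sort functor. With same_id_src set, pairs are compared as (low, high),
// and a pair already stored as (low, high) sorts before its mirror image,
// so <0 5> lands in front of <5 0>.
struct PairLess {
    bool same_id_src;
    bool operator()(const IdPair& a, const IdPair& b) const {
        if (!same_id_src) {
            if (a.id1 != b.id1) return a.id1 < b.id1;
            return a.id2 < b.id2;
        }
        const int alo = std::min(a.id1, a.id2), ahi = std::max(a.id1, a.id2);
        const int blo = std::min(b.id1, b.id2), bhi = std::max(b.id1, b.id2);
        if (alo != blo) return alo < blo;
        if (ahi != bhi) return ahi < bhi;
        return a.id1 < b.id1;  // tie-break: (low, high) before (high, low)
    }
};

// Duplicate test. Mirrored pairs count as equal only when same_id_src is set.
struct PairEqual {
    bool same_id_src;
    bool operator()(const IdPair& a, const IdPair& b) const {
        if (a.id1 == b.id1 && a.id2 == b.id2) return true;
        return same_id_src && a.id1 == b.id2 && a.id2 == b.id1;
    }
};

std::vector<IdPair> dedup_with_flags(std::vector<IdPair> pairs, bool same_id_src) {
    // Step 1: sort so that duplicates become neighbors.
    std::sort(pairs.begin(), pairs.end(), PairLess{same_id_src});

    // Step 2: flag every element that differs from its predecessor.
    // The first element is always kept.
    std::vector<char> keep(pairs.size(), 1);
    const PairEqual eq{same_id_src};
    if (pairs.size() > 1) {
        std::transform(pairs.begin() + 1, pairs.end(), pairs.begin(),
                       keep.begin() + 1,
                       [eq](const IdPair& cur, const IdPair& prev) -> char {
                           return eq(cur, prev) ? 0 : 1;
                       });
    }

    // Step 3: stream compaction driven by the flags
    // (the role of a stencil-based copy_if).
    std::vector<IdPair> out;
    for (std::size_t i = 0; i < pairs.size(); ++i) {
        if (keep[i]) out.push_back(pairs[i]);
    }
    return out;
}
```

Calling dedup_with_flags on the pairs <0 5>, <5 0>, <1 2>, <1 2> with same_id_src set to true keeps <0 5> and <1 2>; with false it keeps <5 0> as well.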
When we don't specify a command-line argument, the special case is considered to be true, and additional "duplicates" are removed. When we do specify a command-line argument, the special case is considered to be false.
EDIT: Working off a comment from paleonix below, we can improve the above implementation by replacing steps 2 and 3 with a call to thrust::unique_copy, which is also a stream compaction operation. The sort process remains unchanged, and only a slight change is made to our previous transform functor to make it usable for the unique_copy operation:
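A minimal host-only sketch of this sort-then-unique variant follows, with std::sort and std::unique_copy standing in for thrust::sort and thrust::unique_copy. IdPair, PairLess, PairEqual and dedup_unique_copy are hypothetical names used only here. A device version would presumably mark the functors `__host__ __device__` and write into a preallocated output range rather than a back inserter.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical array-of-structs view of one (id1, id2) entry.
struct IdPair { int id1; int id2; };

// Sort functor: compares pairs as (low, high) when same_id_src is set and
// places the (low, high) form ahead of its mirror image.
struct PairLess {
    bool same_id_src;
    bool operator()(const IdPair& a, const IdPair& b) const {
        if (!same_id_src) {
            if (a.id1 != b.id1) return a.id1 < b.id1;
            return a.id2 < b.id2;
        }
        const int alo = std::min(a.id1, a.id2), ahi = std::max(a.id1, a.id2);
        const int blo = std::min(b.id1, b.id2), bhi = std::max(b.id1, b.id2);
        if (alo != blo) return alo < blo;
        if (ahi != bhi) return ahi < bhi;
        return a.id1 < b.id1;
    }
};

// Binary predicate for unique_copy: true when b duplicates a.
struct PairEqual {
    bool same_id_src;
    bool operator()(const IdPair& a, const IdPair& b) const {
        if (a.id1 == b.id1 && a.id2 == b.id2) return true;
        return same_id_src && a.id1 == b.id2 && a.id2 == b.id1;
    }
};

std::vector<IdPair> dedup_unique_copy(std::vector<IdPair> pairs, bool same_id_src) {
    // The sort step is unchanged: duplicates become neighbors,
    // with the preferred form first in each run.
    std::sort(pairs.begin(), pairs.end(), PairLess{same_id_src});

    // unique_copy keeps the first element of every run of equal neighbors,
    // which replaces the separate flagging and compaction steps.
    std::vector<IdPair> out;
    std::unique_copy(pairs.begin(), pairs.end(), std::back_inserter(out),
                     PairEqual{same_id_src});
    return out;
}
```

Because unique_copy keeps the first element of each run, the tie-break in the sort functor is what decides that <0 5> survives rather than <5 0>.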