I have a pair of arrays of equal size, I will call them keys and values.
For example:
K: V
1: 99
1: 100
1: 100
1: 100
1: 103
2: 103
2: 105
3: 45
3: 67
The keys are sorted and the values associated with each key are sorted. How do I remove the value duplicates associated with each key and its corresponding key?
That is, I want to compact the above to:
1: 99
1: 100
1: 103
2: 103 <-- This should remain, since key is different
2: 105
3: 45
3: 67
I looked at the stream compaction functions available in Thrust, but was not able to find anything which does this. Is this possible with Thrust? Or do I need to write my own kernel to mark the duplicates in a stencil and then remove them?
The keys are sorted and the values associated with each key are also sorted. Thus, we can consider that the key-value pairs are sorted.
thrust::unique
will work directly on this if it can see these 2 vectors as a single vector. This can be achieved by zipping up the 2 items (key-value) at each position into a single tuple usingzip_iterator
.Here is how to achieve this in-place and also trim the key-value vectors to only the unique elements:
If you want to compact and produce a separate result stream, you need to write your own binary predicate for your type which looks at both elements of the tuple. The
thrust::zip_iterator
can be used to form a virtual tuple iterator from separate arrays.A complete working example of how you might do it looks like this: