Simultaneous access to the same pixel in a ray generation shader - is it safe?

I have a ray generation shader that adds a random number of RGB values to a RWTexture2D<float3> output : register(u0) at random pixel locations:

[shader("raygeneration")]
void foo()
{
    for (;;)
    {
        float2 const raster_point = float2(rand(seed) * width, rand(seed) * height);
        float3 radiance;
        // trace a path starting at raster_point
        // and store the result in radiance.

        tau += /* somehow computed at random */;
        output[uint2(raster_point)] += radiance;

        if (/* exit criterion is met */)
            break;
    }
}

tau is a global float (I don't yet know how to declare it or bind it to a register). Before output can be copied to the back buffer and displayed, I need to divide each element of output by tau:

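[shader("compute")]
[numthreads(1, 1, 1)] // a compute entry point must declare its thread group size
void bar()
{
    // A single thread walks the whole image: i is the row (y), j is the column (x).
    for (uint i = 0; i < height; ++i)
    {
        for (uint j = 0; j < width; ++j)
            output[uint2(j, i)] /= tau; // textures are indexed as (x, y)
    }
}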
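For comparison, the same normalization could be written with one thread per pixel instead of the nested loops. This is only a rough sketch under assumptions: the constant buffer NormalizeConstants, its register, the field image_size, and the entry point name bar_per_pixel are all hypothetical, and it presumes tau has somehow been made available in a constant buffer after foo has finished.

```hlsl
RWTexture2D<float3> output : register(u0);

// Hypothetical constant buffer: tau and the image size would have to be
// supplied by the application once foo has completed.
cbuffer NormalizeConstants : register(b0)
{
    float tau;
    uint2 image_size; // (width, height)
};

[numthreads(8, 8, 1)]
void bar_per_pixel(uint3 id : SV_DispatchThreadID)
{
    // The dispatch is rounded up to whole 8x8 groups, so skip threads
    // that fall outside the image.
    if (id.x >= image_size.x || id.y >= image_size.y)
        return;

    output[id.xy] /= tau;
}
```

With this layout the dispatch would presumably be Dispatch((width + 7) / 8, (height + 7) / 8, 1), so that every pixel is covered by exactly one thread.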

I have several questions about this:

  1. How do I set each element of output to float3(0, 0, 0) before foo is called? I thought this would be possible with ClearUnorderedAccessViewFloat, but I had trouble getting it to work. So, do I need a compute shader that I execute beforehand to clear output appropriately?
  2. I'm calling DispatchRays with width = n and height = depth = 1. Since multiple invocations of foo could try to alter output at the same index at the same time, I was wondering if this is actually safe. Do I need to switch to some kind of atomic operation (InterlockedAdd) here? If so, would it be better to use an array RWTexture2D<float3> outputs[N] and call DispatchRays with width = N instead? Then I would only alter outputs[DispatchRaysIndex().x] in the code. I would also need to alter the definition of foo to ensure that the resulting code is still invoked n times (n != N).
  3. Do I really need to use a compute shader to get what I do with bar? I would also need to sum the outputs in (2.) if I actually should use an array of outputs.
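To make (1.) and (2.) concrete, this is roughly what the compute-shader clear and the atomic variant would look like. It is a sketch under assumptions, not something I know to be the recommended approach: as far as I can tell InterlockedAdd only accepts integer operands, so the sketch swaps the float3 texture for a hypothetical RWByteAddressBuffer named accum that stores fixed-point sums. The names accum, FIXED_POINT_SCALE, ClearConstants, image_size, pixel_offset, clear_accum and add_radiance are all made up for illustration, and the scale factor is an arbitrary guess.

```hlsl
// Hypothetical accumulation buffer: three uints (R, G, B) per pixel, fixed point.
RWByteAddressBuffer accum : register(u1);

// Assumed fixed-point scale; trades precision against overflow of the 32-bit sums.
static const float FIXED_POINT_SCALE = 1024.0f;

// Hypothetical constant buffer holding the image size as (width, height).
cbuffer ClearConstants : register(b0)
{
    uint2 image_size;
};

// Byte offset of a pixel's first channel: 3 channels * 4 bytes each.
uint pixel_offset(uint2 pixel, uint width)
{
    return (pixel.y * width + pixel.x) * 12;
}

// Sketch for (1.): clear pass, one thread per pixel, run before foo.
[numthreads(8, 8, 1)]
void clear_accum(uint3 id : SV_DispatchThreadID)
{
    if (id.x >= image_size.x || id.y >= image_size.y)
        return;

    accum.Store3(pixel_offset(id.xy, image_size.x), uint3(0, 0, 0));
}

// Sketch for (2.): atomic accumulation, called from foo in place of
// `output[uint2(raster_point)] += radiance`.
void add_radiance(uint2 pixel, uint width, float3 radiance)
{
    const uint offset = pixel_offset(pixel, width);

    // Quantize to unsigned fixed point; negative radiance is clamped to zero.
    const uint3 quantized = uint3(max(radiance, 0.0f) * FIXED_POINT_SCALE + 0.5f);

    uint previous; // out parameter of InterlockedAdd, unused here
    accum.InterlockedAdd(offset + 0, quantized.x, previous);
    accum.InterlockedAdd(offset + 4, quantized.y, previous);
    accum.InterlockedAdd(offset + 8, quantized.z, previous);
}
```

The final pass would then have to divide each sum by FIXED_POINT_SCALE as well as by tau when converting back to float. Note that tau itself is also updated by every invocation of foo, so it would presumably need the same kind of treatment.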

I've tried to answer these questions for a couple of days now, but the docs do not give me any indication of what I should do and which of the described options is preferred.
