How to parallelise Cython function with OpenMP to find nearest neighbours in a boid simulation?

37 views Asked by At

I would like to parallise a function using OpenMP, but I have problems due to the reduction variable "counter", which is necessary to give my array of neighbours the correct indices. The first element of each row of the neighbour list is the number of neighbours in the row, followed by the neighbour indices of other boids within a "visual range" of the boid in question. I have tried several different methods of bypassing the need for a counter but I can't seem it get it right.

def GenerateNeighbours(x_values, y_values, num_boids): neighbour_list = np.full((num_boids, num_boids + 1), 0, dtype=int)

for boid_outer in range(num_boids):  # Use a different variable name for the outer loop
    x_vals = x_values[boid_outer]
    y_vals = y_values[boid_outer]
    counter = 1
    
    for boid_inner in range(num_boids):  # Use a different variable name for the inner loop
        distance = (x_values[boid_inner] - x_vals)**2 + (y_values[boid_inner] - y_vals)**2
        if boid_outer != boid_inner and distance < visual_range**2:
            neighbour_list[boid_outer, counter] = boid_inner
            counter += 1

    # Update the count of nearest neighbors
    neighbour_list[boid_outer, 0] = counter - 1

          
return neighbour_list 

This code compiles and runs, but it doesn't correctly index the neighbour list array and leads to terrible behaviour

cdef long[:,::1] GenerateNeighbours(double [::1 ] x_values, double [::1] y_values, int num_boids, int threads): cdef int visual_range = 50 cdef int counter cdef long[:,::1] neighbour_list cdef double x_vals, y_vals, distance cdef Py_ssize_t boid_outer, boid_inner cdef double[::1] x_values_inner = x_values cdef double[::1] y_values_inner = y_values

neighbour_list = np.full((num_boids, num_boids + 1), 0, dtype=int)

for boid_outer in prange(num_boids, nogil = True, num_threads = threads):  # Use a different variable name for the outer loop
    x_vals = x_values[boid_outer]
    y_vals = y_values[boid_outer]
    #counter = 1
    for boid_inner in range(num_boids):  # Use a different variable name for the inner loop
        
        distance = (x_values_inner[boid_inner] - x_vals)**2 + (y_values_inner[boid_inner] - y_vals)**2
        if boid_outer != boid_inner and distance < visual_range**2:
            neighbour_list[boid_outer, boid_inner+1] = boid_inner
            #counter += 1
    # Update the count of nearest neighbors
    neighbour_list[boid_outer, 0] = counter - 1
return neighbour_list 

I just need to come up with a way of matching the behaviour of the initial function that still compiles and runs using a prange loop with no gil. Any suggestions or observations would be greatly appreciated!

1

There are 1 answers

0
DavidW On

I think your problem is that counter is incorrectly being identified as a reduction variable when actually a unique counter should be used for each iteration of the outer loop (what OpenMP would call "private").

To fix that you've just got to rewrite counter += 1 so Cython no longer identifies it as a reduction variable:

counter = counter + 1

That's possible a bit opaque so one option would be to put the entire loop into a cdef and nogil function, so that the outer loop just looks like:

for boid_outer in prange(num_boids, nogil = True, num_threads = threads)
    neighbour_list[boid_outer, 0] = calculate_for_boid_outer(
        boid_outer, num_boids, neighbour_list[boid_outer, :], x_values, y_values)

(or something very similar). That makes it a bit more obvious that each loop iteration is independent.

I'm not able to test any of this because I don't think you've provided enough information to do so so it's a bit of a guess, but hopefully works.