Aparapi and objects with only primitive-typed attributes


I'm writing my first pieces of code with Aparapi.

The task involves handling a list of 3D points that represent a grid-like square 3D surface, with a grid interval set at 1mm. The end goal is to create a grid with a smaller interval of 0.25mm by interpolating the original one. Our existing parallel code, tailored for this task, is based on ManagedScheduledExecutorService within a Jakarta-EE environment, rather than using GPU threads.
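To make the task concrete, here is a minimal CPU-only sketch of refining a 1 mm grid to 0.25 mm (a factor of 4). It assumes bilinear interpolation of the z-values, which the post does not specify, and the class and method names (`GridRefinementSketch`, `nodeCount`, `refine`) are hypothetical. The node-count formula is the same one used for `xs` and `ys` in the index class shown further down.

```java
public final class GridRefinementSketch {

    /** Number of grid nodes along one axis: (max - min) * intervalsPerMillimeter + 1. */
    static int nodeCount(int min, int max, int intervalsPerMillimeter) {
        return (max - min) * intervalsPerMillimeter + 1;
    }

    /**
     * Refines a grid of z-values by an integer factor using bilinear interpolation
     * (an assumption, not necessarily the scheme used in the real code).
     * coarse[i][j] is the z-value at node (i, j); each axis needs at least 2 nodes.
     */
    static double[][] refine(double[][] coarse, int factor) {
        int coarseXs = coarse.length;
        int coarseYs = coarse[0].length;
        int fineXs = (coarseXs - 1) * factor + 1;
        int fineYs = (coarseYs - 1) * factor + 1;
        double[][] fine = new double[fineXs][fineYs];
        for (int i = 0; i < fineXs; i++) {
            // Clamp so the last fine node still has a neighbor at i0 + 1.
            int i0 = Math.min(i / factor, coarseXs - 2);
            double tx = (double) (i - i0 * factor) / factor;
            for (int j = 0; j < fineYs; j++) {
                int j0 = Math.min(j / factor, coarseYs - 2);
                double ty = (double) (j - j0 * factor) / factor;
                // Weighted average of the four surrounding coarse nodes.
                fine[i][j] = (1 - tx) * (1 - ty) * coarse[i0][j0]
                           + tx * (1 - ty) * coarse[i0 + 1][j0]
                           + (1 - tx) * ty * coarse[i0][j0 + 1]
                           + tx * ty * coarse[i0 + 1][j0 + 1];
            }
        }
        return fine;
    }

    public static void main(String[] args) {
        double[][] coarse = {{0.0, 1.0}, {2.0, 3.0}};
        double[][] fine = refine(coarse, 4);
        System.out.println(fine.length + " x " + fine[0].length);
        System.out.println(fine[2][2]);
    }
}
```

Every fine node depends only on the read-only coarse grid, which is why this kind of step looks like a natural candidate for one GPU work item per output node.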

To transition to a GPU-compatible structure, I've begun porting the current code. The first step involves adding the collection of points to a matrix representing the surface. Given that the indices for each element's addition have to be properly computed, I see this step as an easy entry point into the Aparapi "world".

However, while the kernel output appears correct, it falls back to the JTP mode instead of executing on the GPU. The warning message I encounter when Aparapi attempts to generate the OpenCL code is:

WARNING [com.aparapi.logLevel] (default task-1) Device failed for AbstractLensComputer$1, devices={Intel<GPU>|Java Alternative Algorithm|Java Thread Pool}: @41 ASTORE_2 Detected an non-reducable operand consumer/producer mismatch

(the kernel is defined as an anonymous class within AbstractLensComputer, as you will see in the code samples)

Let's delve into the code:

I have a class, GPUCompatibleThreeDimensionalSpaceIndex, to represent the grid-based surface. As you'll observe, it's exclusively built on primitive types, both in its attributes and methods:

package org.visiontech.commons.math.api.index;

public class GPUCompatibleThreeDimensionalSpaceIndex {
    private final int POINT_SIZE = 3;
    private final double[][][] spatialIndex;
    
    
    private final int minX;
    private final int intervalsPerMillimiterX;
    private final int maxX;
    private final int minY;
    private final int intervalsPerMillimiterY;
    private final int maxY;
    
    private final int xs;
    private final int ys;

    public GPUCompatibleThreeDimensionalSpaceIndex(
            int minX,
            int intervalsPerMillimiterX,
            int maxX,
            int minY,
            int intervalsPerMillimiterY,
            int maxY
    ) {
        this.minX = minX;
        this.intervalsPerMillimiterX = intervalsPerMillimiterX;
        this.maxX = maxX;
        this.minY = minY;
        this.intervalsPerMillimiterY = intervalsPerMillimiterY;
        this.maxY = maxY;
        
        this.xs = (this.maxX - this.minX)*this.intervalsPerMillimiterX + 1;
        this.ys = (this.maxY - this.minY)*this.intervalsPerMillimiterY + 1;
        
        this.spatialIndex = new double[this.xs][this.ys][POINT_SIZE];
    }

    public boolean insert(double[] threeDimensionalPoint) {
        int[] indexes = computeIndexes(threeDimensionalPoint);
        
        if (indexes == null) {
            return false;
        }
        this.spatialIndex[indexes[0]][indexes[1]] = threeDimensionalPoint;
        return true;
    }

    public boolean contains(double[] twoDimensionalPoint) {
        int[] indexes = computeIndexes(twoDimensionalPoint);
        if (indexes == null) {
            return false;
        }
        int xIndex = indexes[0];
        int yIndex = indexes[1];
        double[] candidatePoint = this.spatialIndex[xIndex][yIndex];
        return candidatePoint[0] == twoDimensionalPoint[0] && candidatePoint[1] == twoDimensionalPoint[1];
    }

    public double[] search(double[] twoDimensionalPoint) {
        int[] indexes = computeIndexes(twoDimensionalPoint);
        if (indexes == null) {
            return null;
        }
        int xIndex = indexes[0];
        int yIndex = indexes[1];
        double[] candidatePoint = this.spatialIndex[xIndex][yIndex];
        if (candidatePoint[0] == twoDimensionalPoint[0] && candidatePoint[1] == twoDimensionalPoint[1]) {
            return candidatePoint;
        } else {
            return null;
        }
    }

    public double[] get(double[] twoDimensionalPoint) {
        return search(twoDimensionalPoint);
    }

    public double[] nearest(double[] twoDimensionalPoint) {
        int[] indexes = computeIndexes(twoDimensionalPoint, false);
        if (indexes == null) {
            return null;
        }
        int xIndex = indexes[0];
        int yIndex = indexes[1];
        double[] candidatePoint = this.spatialIndex[xIndex][yIndex];
        return candidatePoint;
    }
    
    public double[] nextAlongX(double[] threeDimensionalPoint) {
        int[] indexes = computeIndexes(threeDimensionalPoint);
        if (indexes == null) {
            return null;
        }
        int xIndex = indexes[0] + 1;
        if (xIndex >= this.xs) {
            return null;
        }
        int yIndex = indexes[1];
        double[] candidatePoint = this.spatialIndex[xIndex][yIndex];
        return candidatePoint;
    }

    public double[] nextAlongY(double[] threeDimensionalPoint) {
        int[] indexes = computeIndexes(threeDimensionalPoint);
        if (indexes == null) {
            return null;
        }
        int xIndex = indexes[0];
        int yIndex = indexes[1] + 1;
        if (yIndex >= this.ys) {
            return null;
        }
        double[] candidatePoint = this.spatialIndex[xIndex][yIndex];
        return candidatePoint;
    }
    
    public double[] previousAlongX(double[] threeDimensionalPoint) {
        int[] indexes = computeIndexes(threeDimensionalPoint);
        if (indexes == null) {
            return null;
        }
        int xIndex = indexes[0] - 1;
        if (xIndex < 0) {
            return null;
        }
        int yIndex = indexes[1];
        double[] candidatePoint = this.spatialIndex[xIndex][yIndex];
        return candidatePoint;
    }

    public double[] previousAlongY(double[] threeDimensionalPoint) {
        int[] indexes = computeIndexes(threeDimensionalPoint);
        if (indexes == null) {
            return null;
        }
        int xIndex = indexes[0];
        int yIndex = indexes[1] - 1;
        if (yIndex < 0) {
            return null;
        }
        double[] candidatePoint = this.spatialIndex[xIndex][yIndex];
        return candidatePoint;
    }
    
    private int[] computeIndexes(double[] threeDimensionalPoint) {
        return computeIndexes(threeDimensionalPoint, true);
    }

    private int[] computeIndexes(double[] threeDimensionalPoint, boolean nullify) {
        double xIndexCandidate = (threeDimensionalPoint[0] - this.minX)*this.intervalsPerMillimiterX;
        double yIndexCandidate = (threeDimensionalPoint[1] - this.minY)*this.intervalsPerMillimiterY;
        int xIndexFloor;
        if (xIndexCandidate >= 0) {
            xIndexFloor = (int) xIndexCandidate; 
        } else {
            xIndexFloor = (int) xIndexCandidate - 1;
        }
        int yIndexFloor;
        if (yIndexCandidate >= 0) {
            yIndexFloor = (int) yIndexCandidate; 
        } else {
            yIndexFloor = (int) yIndexCandidate - 1;
        }
        int xIndex;
        int yIndex;
        if (xIndexCandidate - xIndexFloor < 0.5) {
            xIndex = xIndexFloor;
        } else {
            xIndex = xIndexFloor + 1;
        }
        if (yIndexCandidate - yIndexFloor < 0.5) {
            yIndex = yIndexFloor;
        } else {
            yIndex = yIndexFloor + 1;
        }
        if (nullify) {
            if (xIndex < 0 || xIndex >= this.xs || yIndex < 0 || yIndex >= this.ys) {
                return null;
            }            
        } else {
            if (xIndex < 0) {
                xIndex = 0;
            } else if (xIndex >= this.xs) {
                xIndex = this.xs - 1;
            }
            if (yIndex < 0) {
                yIndex = 0;
            } else if (yIndex >= this.ys) {
                yIndex = this.ys - 1;
            }
        }
        int[] indexes = {xIndex, yIndex};
        return indexes;
    }
}

Additionally, here's a Java function that creates the Kernel and relies on the aforementioned class (focus on the index variable and on the index.insert() invocation):

protected GPUCompatibleThreeDimensionalSpaceIndex toGPUCompatibleIndex(Collection<ThreeDimensionalCartesianPoint> pointList, Double intervalsPerMillimiter) {
        Integer minX = Integer.MAX_VALUE;
        Integer maxX = Integer.MIN_VALUE;
        Integer minY = Integer.MAX_VALUE;
        Integer maxY = Integer.MIN_VALUE;
        for (ThreeDimensionalCartesianPoint point : pointList) {
            // Independent checks: a point can be both the current minimum
            // and the current maximum (e.g. the first point visited).
            if (point.getX() < minX) {
                minX = point.getX().intValue();
            }
            if (point.getX() > maxX) {
                maxX = point.getX().intValue();
            }
            if (point.getY() < minY) {
                minY = point.getY().intValue();
            }
            if (point.getY() > maxY) {
                maxY = point.getY().intValue();
            }
        }
        final GPUCompatibleThreeDimensionalSpaceIndex index =
            new GPUCompatibleThreeDimensionalSpaceIndex(
                minX, intervalsPerMillimiter.intValue(), maxX, minY, intervalsPerMillimiter.intValue(), maxY
            );
        
        final double[] pointArray = new double[pointList.size()*3];
        Range pointArrayRange = Range.create(pointList.size());
        Iterator<ThreeDimensionalCartesianPoint> pointIterator = pointList.iterator();
        IntStream.range(0, pointList.size()).forEach(
            pointIndex -> {
                ThreeDimensionalCartesianPoint point = pointIterator.next();
                pointArray[3*pointIndex] = point.getX();
                pointArray[3*pointIndex + 1] = point.getY();
                pointArray[3*pointIndex + 2] = point.getZ();
            }
        );
        Kernel indexingKernel = new Kernel() {
            @Override
            public void run() {
                int globalId = 3*getGlobalId();
                double[] point = {pointArray[globalId], pointArray[globalId + 1], pointArray[globalId + 2]};
                index.insert(point);
            }
        };
        indexingKernel.setExplicit(true);
        indexingKernel.execute(pointArrayRange);

        return index;
    }

As you can see, all objects are converted to primitive types beforehand, and the kernel operates solely on primitive variables, except for the index.insert() invocation. Analogously, all the code contained in, and invoked by, index.insert() relies only on primitive types and further method invocations.

Is this approach supposed to work, or should I move all the code into the kernel, without relying on other classes? I'd like to keep GPUCompatibleThreeDimensionalSpaceIndex, since its functionality will also be needed by future pieces of code that run on the GPU. Is my preference compatible with Aparapi? If it isn't, would moving all of GPUCompatibleThreeDimensionalSpaceIndex's attributes and methods into the Kernel actually solve the issue?
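For concreteness, the second option (everything moved into the kernel) could be shaped roughly like the sketch below. It is plain Java with no Aparapi dependency, so it runs on the CPU only; whether Aparapi can translate this shape to OpenCL is exactly the open question. The class name `FlatIndexSketch`, the flattened one-dimensional `spatialIndex` layout, and the single `intervalsPerMillimeter` parameter are assumptions made for illustration. Unlike `insert()` above, which replaces an array reference, the sketch copies the three coordinates into the flat array and allocates nothing per point.

```java
public final class FlatIndexSketch {
    static final int POINT_SIZE = 3;

    // Only primitives and one-dimensional primitive arrays.
    final int minX;
    final int minY;
    final int intervalsPerMillimeter;
    final int xs;
    final int ys;
    final double[] pointArray;   // x0, y0, z0, x1, y1, z1, ...
    final double[] spatialIndex; // flattened [xs][ys][POINT_SIZE]

    FlatIndexSketch(int minX, int maxX, int minY, int maxY,
                    int intervalsPerMillimeter, double[] pointArray) {
        this.minX = minX;
        this.minY = minY;
        this.intervalsPerMillimeter = intervalsPerMillimeter;
        this.xs = (maxX - minX) * intervalsPerMillimeter + 1;
        this.ys = (maxY - minY) * intervalsPerMillimeter + 1;
        this.pointArray = pointArray;
        this.spatialIndex = new double[xs * ys * POINT_SIZE];
    }

    /** Floor-then-compare rounding, mirroring computeIndexes(). */
    static int nearestIndex(double candidate) {
        int floor = (int) candidate;
        if (candidate < 0) {
            floor = floor - 1;
        }
        if (candidate - floor < 0.5) {
            return floor;
        }
        return floor + 1;
    }

    /** Body that would live in Kernel.run(), with globalId standing in for getGlobalId(). */
    void insertPoint(int globalId) {
        int base = POINT_SIZE * globalId;
        int xIndex = nearestIndex((pointArray[base] - minX) * intervalsPerMillimeter);
        int yIndex = nearestIndex((pointArray[base + 1] - minY) * intervalsPerMillimeter);
        if (xIndex < 0 || xIndex >= xs || yIndex < 0 || yIndex >= ys) {
            return; // out of range: skip, like insert() returning false
        }
        int target = POINT_SIZE * (xIndex * ys + yIndex);
        spatialIndex[target] = pointArray[base];
        spatialIndex[target + 1] = pointArray[base + 1];
        spatialIndex[target + 2] = pointArray[base + 2];
    }

    public static void main(String[] args) {
        double[] points = {0.0, 0.0, 5.0, 1.0, 2.0, 7.0};
        FlatIndexSketch sketch = new FlatIndexSketch(0, 1, 0, 2, 1, points);
        for (int id = 0; id < 2; id++) {
            sketch.insertPoint(id); // sequential stand-in for kernel.execute(range)
        }
        System.out.println(sketch.spatialIndex[17]); // z of the point stored at cell (1, 2)
    }
}
```

The drawback of this layout is the one raised above: the index logic is no longer reusable as a separate class and would have to be repeated in every kernel that needs it.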
