Java Concurrent - Thread synchronization in Fork/Join - LU decomposition

1k views Asked by At

During developing a Java software, I asked here Java Concurrent - no speedUp gained LU algorithm - False sharing? why I don't have no speed up parallelizing this code using CyclicBarrier.

       public void decompose(){
int n = A.length;
for(int k=0; k<n-1;k++) {
    for(int i=k+1; i<n; i++) {
        A[i][k] = A[i][k]/A[k][k];
        for(int j=k+1; j<n; j++) {
            A[i][j] = A[i][j] - A[i][k] * A[k][j];
        }
    }
}
decomposed = true;
}  

The algorithm basically do the gaussian reduction of the matrix

After some discussion (if you interested just see the comments), an user (brettw) replied with this solution using the Fork/Join Java Framework:

    public void decompose()
{
 final Semaphore semaphore = new Semaphore(0);

class Decompose extends RecursiveAction {
    private final int k;

    Decompose(int k) {
        this.k = k;
    }

    protected void compute() {
        final int n = A.length;
        for (int i = k + 1; i < n; i++) {
            A[i][k] = A[i][k] / A[k][k];
            for (int j = k + 1; j < n; j++) {
                A[i][j] = A[i][j] - A[i][k] * A[k][j];
            }
        }

        semaphore.release();
    }
}

ForkJoinPool mainPool = new ForkJoinPool();
for (int k = 0; k < A.length - 1; k++) {
    mainPool.execute(new Decompose(k));
}
semaphore.acquireUninterruptibly(A.length - 1);
}

The problem is that this algorithm doesn't produce the expected result since there is no synchronization with the worker (every row have to be update the all the value for increment the k value).

My question is:

What kind of synchronization strategy you would suggest since I cannot foresee the number of threads/workers?

1

There are 1 answers

1
edharned On BEST ANSWER

You are not using the Fork-Join pool the way it was intended. You need to fragment the work into small sections. Then decompose() each section and combine the results.

compute() needs code such as:

if (work < max) {
   work left = split left half
   work right = split right half
   fork (left) 
   right.compute()
   left.join()
} 

Without splitting the work you are not going to use multiple threads. Using semaphores single threads the work and you will never see a speed up.

Look at the examples in the API for using this framework.