How should I parallelize a computationally expensive for loop and collate the iteration results?

202 views Asked by At

I'm working on an 8-core machine and am performing a computationally heavy task. However, each execution of the task (i.e., iteration of for loop) is rather independent of the previous one. There are only some variables that are 'summed up' from one execution to the next. I'm guessing this is a good example for parallelizing/threading but I'm not sure how to go about it.

Here's how the code looks. As of now, it's just part of the main method in my main executor class:

double testerPayoffSum = 0.0, developerPayoffSum = 0.0;
Random seed = new Random();

        try {
            for (int i = 0; i < GameConstants.MAX_GAMES; i++) {
                EraserSimulator eraser = new EraserSimulator(GameConstants.MAX_TARGETS, GameConstants.MAX_RESOURCES, GameConstants.NUM_ATTACKER_TYPES, seed.nextInt());
                Map<Set<SingleObjectiveTarget>, Double> gameStrategy = eraser.run();

                assert (gameStrategy != null);

                TestingGameSimulator testingGame = new TestingGameSimulator(GameConstants.MAX_TARGETS, gameStrategy, GameConstants.NUM_GAMES_TO_STORE_FOR_HISTORY, GameConstants.NUM_TESTING_GAMES_TO_PLAY);                
                PlayerPayoffs payoffs = testingGame.run(eraser.getEraserInstance());

                testerPayoffSum += payoffs.getAverageTesterPayoff(GameConstants.NUM_TESTING_GAMES_TO_PLAY);
                developerPayoffSum += payoffs.getAverageDeveloperPayoff(GameConstants.NUM_TESTING_GAMES_TO_PLAY);
                System.out.print("Output: ERASER Games played; Number of developers caught");
                System.out.print(", " + GameConstants.NUM_TESTING_GAMES_TO_PLAY + ", " + payoffs.getNumTimesCaught() + "\n");

            } catch(Exception e){sendEmailAlert("Execution Failed with Exception");}

I'd like to parallelize the for-loop computation if possible and keep summing up the testerPayoffSum and developerPayofffSum variables. How might I achieve this?

Note: Each execution of the for loop takes about 20-30 minutes depending on the input size (as set by the various GameConstants). Even for a small number of MAX_GAMES the above takes close to 2-3 hours.

3

There are 3 answers

0
Smutje On BEST ANSWER

Create a thread object implementing Callable which returns a Future object containing your testerPayoffSum and developerPayoffSum, start the calculation and sum the results obtained from the Futures (See also https://blogs.oracle.com/CoreJavaTechTips/entry/get_netbeans_6).

0
Spektre On

are you absolutely sure you have no dependency ?

1.used classes must not share any variables

  • if it does then you have to add locks
  • but it will affect performance
  • if some shared variables are used extensively
  • then the performance can drop significantly even bellow the non-parallel execution

2.used classes must not use any kind of machine learning.

  • there is no solution for this
  • because parallelization will corrupt your results

Now how to do it (I am not JAVA coder so I stick to C++ code).

//--- globals and headers -----------------------------------------------------

unsigned long __stdcall function(LPVOID p);
Random seed = new Random();
const int N=8;          // threads count (<=CPU count)
int id[N];          // thread id
int max[N];         // number of games per thread
double testerPayoffSum[N];  // sum to separate variables to avoid locks need
double developerPayoffSum[N];
volatile int run=0,stop=0;  // thread control variables run is number of running threads and stop force stop...

//--- main code ---------------------------------------------------------------

// init some variables ... may be the seed init will be better here too
int i;                      
for (i = 0; i < N; i++) 
    {
    id[i]=i;
    max[i]=GameConstants.MAX_GAMES / N;
    testerPayoffSum[i]=0.0;
    developerPayoffSum[i]=0.0;
    } 
max[0]=GameConstants.MAX_GAMES % N;

// create threads
for (i = 0; i < N; i++)
    {
    HANDLE hnd=CreateThread(0,0,function,&id[i],0,0); 
    if (hnd!=NULL) CloseHandle(hnd); // this line is important !!! 
    // because if you do not close Handle it will be allocated until the end of app
    // handle leaks are nasty and cause weird OS behaviour
    // I saw many times this bug in commercial drivers
    // it is a nightmare for 24/7 software
    }

// wait for them
while (run) Sleep(200);

// sum the results to [0]
for (i = 1; i < N; i++)
    {
    testerPayoffSum[0]   +=testerPayoffSum[i];
    developerPayoffSum[0]+=developerPayoffSum[i];
    }
// here do what you need to do with the results

//--- thread function ---------------------------------------------------------
unsigned long __stdcall function(LPVOID p)
    {
    run++;
    int ix=((int*)p)[0];
    for (i = 0; i < max[ix]; i++)
        {
        if (stop) break;
        EraserSimulator eraser = new EraserSimulator(GameConstants.MAX_TARGETS, GameConstants.MAX_RESOURCES, GameConstants.NUM_ATTACKER_TYPES, seed.nextInt());
        Map<Set<SingleObjectiveTarget>, Double> gameStrategy = eraser.run();
        assert (gameStrategy != null);
        TestingGameSimulator testingGame = new TestingGameSimulator(GameConstants.MAX_TARGETS, gameStrategy, GameConstants.NUM_GAMES_TO_STORE_FOR_HISTORY, GameConstants.NUM_TESTING_GAMES_TO_PLAY);                
        PlayerPayoffs payoffs = testingGame.run(eraser.getEraserInstance());
        testerPayoffSum[ix] += payoffs.getAverageTesterPayoff(GameConstants.NUM_TESTING_GAMES_TO_PLAY);
        developerPayoffSum[ix] += payoffs.getAverageDeveloperPayoff(GameConstants.NUM_TESTING_GAMES_TO_PLAY);
//      do not call any visual stuff from thread !!! sometimes it can cause a lot of problems ...
//      instead cretae some global string variable and set it to what shoud be printed out
//      and inside wait while loop in main code add if string != "" then System.out.print(string);
//      but in that case you should add lock to it.
//      System.out.print("Output: ERASER Games played; Number of developers caught");
//      System.out.print(", " + GameConstants.NUM_TESTING_GAMES_TO_PLAY + ", " + payoffs.getNumTimesCaught() + "\n");
        //Sleep(100); // well placed sleep
        }
    run--;
    }

[Notes]

  • from your code I am assuming that GameConstants is shared variable !!!
  • if it is only for read than it is OK
  • but if you do also write to it inside thread (I suspect that yes)
  • then you have a big problem because you need to add locks inside your game class then ...
  • if no machine learning is done then you could avoid this
  • by creating separate GameConstants variables for each thread like ... GameConstants[N]
  • but you need to rewrite the code so it access the GameConstants[ix] and not GameConstants

[lock]

  • have no clue how locks are implemented in JAVA
  • but you can also use your own something like this

    class _lock
     {
    public:
     volatile bool locked;
     _lock() { locked=false; }
     void lock() { while(locked) Sleep(1); locked=true; }
     void unlock() { locked=false; }
     };
    
    // now for each shared variable (or group of variables) add one global _lock variable
    _lock l1; int sv1; // shared variable 1 and her lock
    
    // any write access and sometimes also read access needs lock
    l1.lock();
    sv1++;
    l1.unlock();
    
  • beware that locks can sometimes cause App freeze especially while heavy duty use.

  • does not matter if it is own lock or OS lock
  • this occurs mainly while mixing visual stuff or some OS calls inside threads and not in main thread
  • in that case sometimes a well placed sleep helps but avoid OS calls inside threads if you can
  • because it cause very many other problems ...
  • also try to be locked as small time as possible because in case of conflict the conflicting threads are stopped !!!
  • therefore you cannot just add lock at the start of loop and unlock at the end
  • because the parallelism speedup will be lost then
0
Alexei Kaigorodov On

Declare a queue to collect results and submit tasks to a thread pool:

    final ArrayBloclingQueue<PlayerPayoffs> queue=new ArrayBloclingQueue<PlayerPayoffs>();
    Executor exec=new Executors.newFixedThreadPool(N); // number of threads depends on hardware
    for (int i = 0; i < GameConstants.MAX_GAMES; i++) {
        exec.execute(new Runnable(){          
            EraserSimulator eraser = new EraserSimulator(GameConstants.MAX_TARGETS, GameConstants.MAX_RESOURCES, GameConstants.NUM_ATTACKER_TYPES, seed.nextInt());
            Map<Set<SingleObjectiveTarget>, Double> gameStrategy = eraser.run();

            assert (gameStrategy != null);

            TestingGameSimulator testingGame = new TestingGameSimulator(GameConstants.MAX_TARGETS, gameStrategy, GameConstants.NUM_GAMES_TO_STORE_FOR_HISTORY, GameConstants.NUM_TESTING_GAMES_TO_PLAY);      
            PlayerPayoffs payoffs = testingGame.run(eraser.getEraserInstance());
            queue.put(payoffs);
        });
    }

Then collect and sum results:

    double testerPayoffSum = 0.0, developerPayoffSum = 0.0;
    for (int i = 0; i < GameConstants.MAX_GAMES; i++) {
            PlayerPayoffs payoffs=queue.take();
            testerPayoffSum += payoffs.getAverageTesterPayoff(GameConstants.NUM_TESTING_GAMES_TO_PLAY);
            developerPayoffSum += payoffs.getAverageDeveloperPayoff(GameConstants.NUM_TESTING_GAMES_TO_PLAY);
            System.out.print("Output: ERASER Games played; Number of developers caught");
            System.out.print(", " + GameConstants.NUM_TESTING_GAMES_TO_PLAY + ", " + payoffs.getNumTimesCaught() + "\n");
    }