Optimal scheduling and placement of embarrassingly parallel job


I'm trying to optimize the run time of an embarrassingly parallel code. I'm hoping something already exists that can do this and I just didn't find it in my searches.

Info...

  • The code is embarrassingly parallel
  • There are 100 runs overall, and I would like to be able to run 20 instances concurrently
  • Runtime and RAM requirements scale roughly linearly from job 1 (5 hours, 20 GB) to job 100 (10 hours, 30 GB); see the sketch just after this list
  • Individual instances of the code use threaded GEMM calls (currently Intel MKL, 2 threads per job)
  • The computer has 2 NUMA nodes (dual-socket system)
  • I am limited by total RAM (I cannot run all of the higher-memory jobs at the same time)
  • Currently some jobs cross over between NUMA nodes and slow everything to a crawl
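
To make the resource model concrete, this is the linear interpolation I have in mind (my own approximation, not measured values for every job):

```python
# Linear interpolation I'm assuming between job 1 (5 h, 20 GB)
# and job 100 (10 h, 30 GB).
def runtime_hours(i: int) -> float:
    return 5.0 + 5.0 * (i - 1) / 99.0

def ram_gb(i: int) -> float:
    return 20.0 + 10.0 * (i - 1) / 99.0

# Running the 20 largest jobs (81..100) at once would need roughly
# sum(ram_gb(i) for i in range(81, 101)) ~= 580 GB, which illustrates
# why I can't simply run the high-memory jobs all at the same time.
```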

How can I optimally schedule these jobs, in terms of timing and RAM placement, in a NUMA-aware fashion? Is there a script or scheduling system that already handles this, or would I need to roll my own?
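
In case it helps clarify what I mean by "roll my own", this is the kind of launcher I have in mind. It is only a rough sketch under my own assumptions: `./my_job` stands in for my actual binary, `NODE_RAM_GB` is a guess at usable per-socket RAM, and the greedy longest-job-first policy is just the first heuristic that came to mind, not something I claim is optimal.

```python
#!/usr/bin/env python3
"""Rough sketch of a hand-rolled NUMA- and RAM-aware launcher.

Assumptions (mine): two NUMA nodes, each with NODE_RAM_GB of usable RAM;
jobs are started as `./my_job <index>` (placeholder command); per-job RAM
and runtime follow the linear model above. Each job is pinned to one node
with `numactl --cpunodebind/--membind` so its threads and memory stay on
one socket.
"""
import os
import subprocess
import time

NODE_RAM_GB = 300.0    # usable RAM per NUMA node (placeholder guess)
MAX_CONCURRENT = 20    # overall concurrency limit from the question
NUM_NODES = 2

def runtime_hours(i: int) -> float:
    return 5.0 + 5.0 * (i - 1) / 99.0

def ram_gb(i: int) -> float:
    return 20.0 + 10.0 * (i - 1) / 99.0

# Longest-job-first ordering tends to reduce makespan, so start the
# big (and high-memory) jobs early.
pending = sorted(range(1, 101), key=runtime_hours, reverse=True)
free_ram = [NODE_RAM_GB] * NUM_NODES
running = []  # list of (Popen handle, job index, numa node)

while pending or running:
    # Reap finished jobs and return their RAM budget to their node.
    still_running = []
    for proc, job, node in running:
        if proc.poll() is None:
            still_running.append((proc, job, node))
        else:
            free_ram[node] += ram_gb(job)
    running = still_running

    # Greedily start jobs on whichever node has the most free RAM.
    launched = []
    for job in pending:
        if len(running) >= MAX_CONCURRENT:
            break
        node = max(range(NUM_NODES), key=lambda n: free_ram[n])
        if free_ram[node] < ram_gb(job):
            continue  # no node has enough headroom for this job right now
        cmd = ["numactl", f"--cpunodebind={node}", f"--membind={node}",
               "./my_job", str(job)]  # "./my_job" is a stand-in binary
        env = {**os.environ, "MKL_NUM_THREADS": "2"}  # 2 MKL threads per job
        proc = subprocess.Popen(cmd, env=env)
        free_ram[node] -= ram_gb(job)
        running.append((proc, job, node))
        launched.append(job)
    for job in launched:
        pending.remove(job)

    time.sleep(30)  # poll interval
```

That is roughly the behaviour I would want: per-node RAM accounting plus pinning so nothing straddles the sockets. I would rather not write and maintain this myself if an existing tool already handles RAM-plus-NUMA placement.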

I have more questions about implementing this, but I think adding them here would just make things more confusing.
