parallel execution with a fixed order

#!/bin/bash
doone() {
    tracelength="$1"
    short="$2"
    long="$3"
    ratio="$4"
    echo "$tracelength $short $long $ratio" >> results.csv
    python3 main.py "$tracelength" "$short" "$long" "$ratio" >> file.smt2
    gtime -f "%U" /Users/Desktop/optimathsat-1.5.1-macos-64-bit/bin/optimathsat < file.smt2
}
export -f doone
step=0.1
parallel doone \
         ::: 200 300 \
         :::: <(seq 0 $step 0.2) \
         ::::+ <(seq 1 -$step 0.8) \
         :::: <(seq 0 $step 0.1) \
         ::: {1..2} &> results.csv

I need the data in results.csv to be in order. Every job prints its inputs (the four variables mentioned at the beginning: $tracelength, $short, $long and $ratio) followed by the execution time of that job, all on one line. So far my results look something like this:

0.00
0.00
0.00
0.00
200 0 1 0
200 0 1 0.1
200 0.1 0.9 0

How can I fix the order? And why is the execution time always 0.00? file.smt2 is a big file, so the execution time cannot possibly be 0.00.

There is 1 answer

Ole Tange (best answer)

It is really a bad idea to append to the same file in parallel. You are going to have race conditions all over the place.

You are doing that with both results.csv and file.smt2.

So if you write to a file in doone, make sure it has a unique name (e.g. by using myfile.$$).
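The unique-file idea can be tried out without the solver. The following is only a minimal sketch: the job function, the result.N file names, and the sleep that stands in for the real work are all hypothetical. It uses mktemp -d to get a private directory instead of the myfile.$$ naming, gives each background job its own output file, and merges the files in input order once all jobs are done.

```bash
#!/bin/bash
# Private scratch directory, so file names cannot clash with other runs.
workdir=$(mktemp -d)

# Hypothetical stand-in for doone: each job writes only to its own file.
job() {
    local id="$1"
    local out="$workdir/result.$id"   # unique per job, never shared
    # Stand-in for the real work; jobs with a higher id finish sooner on purpose.
    sleep "0.$((4 - id))"
    echo "input $id done" > "$out"
}

for id in 1 2 3; do
    job "$id" &    # run the jobs concurrently
done
wait               # block until every background job has finished

# Merge in input order, regardless of which job finished first.
merged=$(cat "$workdir/result.1" "$workdir/result.2" "$workdir/result.3")
echo "$merged"
rm -r "$workdir"
```

Because no two jobs ever write to the same file, the merged result does not depend on how the jobs were scheduled.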

To see if race conditions are your problem, you can make GNU Parallel run one job at a time: parallel --jobs 1.

If that makes the problem go away, then you can probably get away with:

doone() {
    tracelength="$1"
    short="$2"
    long="$3"
    ratio="$4"
    # No >> is needed here, as all output is sent to results.csv
    echo "$tracelength $short $long $ratio"
    tmpfile=file.smt.$$
    cp file.smt2 "$tmpfile"
    python3 main.py "$tracelength" "$short" "$long" "$ratio" >> "$tmpfile"
    # Be aware that the output from gtime and optimathsat will also be put
    # into results.csv, so results.csv will not be a valid CSV file
    gtime -f "%U" /Users/Desktop/optimathsat-1.5.1-macos-64-bit/bin/optimathsat < "$tmpfile"
    rm "$tmpfile"
}

If results.csv is just a log file, consider using parallel --joblog my.log instead.
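For the ordering part of the question, GNU Parallel also has a --keep-order option, which prints each job's output in the order of the inputs rather than in the order the jobs finish. The answer above does not mention it, so treat the following as a sketch under that assumption rather than as the recommended fix. The sleep/echo command and the temporary log file are placeholders, and the demo only runs when GNU Parallel is installed.

```bash
#!/bin/bash
# Only run the demo when GNU Parallel (not another "parallel" tool) is installed.
if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    log=$(mktemp)   # placeholder for a log file such as my.log
    # The job given 3 sleeps longest and finishes last,
    # yet --keep-order prints the outputs in input order.
    out=$(parallel --keep-order --joblog "$log" 'sleep 0.{}; echo job {}' ::: 3 2 1)
    echo "$out"
    # The job log starts with a header line (Seq, Host, Starttime, JobRuntime, ...).
    head -n 1 "$log"
    rm "$log"
fi
```

The job log records the runtime of each job in its own column, which may be enough if results.csv was only meant to hold the inputs and the timing.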