A few days ago it occurred to me that a piece of code I use quite often to implement a min/max reduction in OpenMP might not actually be correct.
In cases where the OpenMP min/max reduction clause was not available (old OpenMP version), or where I also needed the index of the maximum value, I used code like this:
#pragma omp parallel private(myMax,myMax_idx) shared(globalMax,globalMax_idx)
{
#pragma omp for
for (...) {
}
if (myMax >= globalMax) {
#pragma omp critical
{
if ((myMax > globalMax) || (myMax == globalMax && globalMax_idx < myMax_idx)) {
globalMax = myMax;
globalMax_idx = myMax_idx;
}
}
}
}
Then it occurred to me that this code might produce wrong results, because a shared variable does NOT necessarily mean that all threads access the same memory location at all times: each thread may work on a temporary copy that is not up to date with the other threads.
So I would need to use #pragma omp flush to synchronize the variable:
[...]
#pragma omp flush(globalMax)
if (myMax > globalMax) {
#pragma omp critical
{
if (myMax > globalMax) globalMax = myMax;
}
}
[...]
In M. Süß et al., "Common Mistakes in OpenMP and How To Avoid Them", this implementation is described as follows:
"This is essentially a reimplementation of a reduction using the max operator."
But I wonder whether this piece of code is correct, because I don't see the writing thread flushing its version of globalMax to memory.
Also, when searching for the index, I would need to flush the globalMax_idx variable as well, right?
This question is related to:
- OpenMp C++ algorithms for min, max, median, average, but the accepted answer to that question does not use flush at all, so I'm unsure whether it is really robust.
- explict flush direcitve with OpenMP: when is it necessary and when is it helpful, which links to this tutorial, which says that a critical region does a flush on entry and exit (I have not yet found this information anywhere else, though). That would make my naive implementation and the answer to the question above correct.
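For reference, if the tutorial's claim about the critical region is right, a variant that reads and writes the shared result only inside the critical region should not need any explicit flush. The following is a minimal sketch under that assumption, not a definitive implementation. It assumes a non-empty array of doubles; `MaxResult`, `find_max_with_index`, `data`, `n` and `haveLocal` are hypothetical names that do not appear in the code above.

```c
#include <stddef.h>

/* Hypothetical result type: the largest value and the index where it occurs. */
typedef struct {
    double value;
    size_t index;
} MaxResult;

/* Returns the maximum of data[0..n-1] and its index.
 * Ties go to the larger index, as in the snippet in the question.
 * Assumption: n >= 1 (the array is not empty). */
MaxResult find_max_with_index(const double *data, size_t n)
{
    /* Start from element 0 so that no sentinel value is needed. */
    MaxResult global = { data[0], 0 };

    #pragma omp parallel shared(global)
    {
        /* Per-thread candidate; declared inside the region, so it is private. */
        int haveLocal = 0;      /* set once this thread has seen an element */
        double myMax = 0.0;
        size_t myMax_idx = 0;

        #pragma omp for nowait
        for (ptrdiff_t i = 0; i < (ptrdiff_t)n; ++i) {
            if (!haveLocal || data[i] > myMax ||
                (data[i] == myMax && (size_t)i > myMax_idx)) {
                myMax = data[i];
                myMax_idx = (size_t)i;
                haveLocal = 1;
            }
        }

        /* The shared result is read and written only in here, so no
         * unsynchronized access to it happens anywhere in the region. */
        #pragma omp critical
        {
            if (haveLocal &&
                (myMax > global.value ||
                 (myMax == global.value && myMax_idx > global.index))) {
                global.value = myMax;
                global.index = myMax_idx;
            }
        }
    } /* implicit barrier at the end of the parallel region */

    return global;
}
```

Called on `{3.0, 7.5, -1.0, 7.5, 2.0}`, this should give the value 7.5 at index 3, since ties go to the larger index. The price compared to the pre-check outside the critical region is that every thread enters the critical region once, even when its local maximum cannot win.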
So if the code from "Common Mistakes in OpenMP" assumes that the critical region does a flush, is it really worth explicitly flushing the globalMax variable before the if?
What code should I use?