Sequential loop, critical section, atomic update.
I have a 2D graph consisting of vertices and edges.
For each vertex, I want to compute the number of degrees of freedom (dof). A vertex dof = the number of edges connected to the vertex.
What I do is:
std::vector<size_t> dofs(graph.getNumVertices(), 0);
#pragma omp parallel for
for (int edgeInd = 0; edgeInd < graph.getNumEdges(); ++edgeInd)
{
#pragma omp atomic
++dofs[graph.getEdge(edgeInd).v1.index];
#pragma omp atomic
++dofs[graph.getEdge(edgeInd).v2.index];
}
Which works as expected.
My questions:
All edges are distributed among different threads. Let's assume there are only two threads: 1 and 2.
An edge can have:
- a vertex "exclusively owned" by thread 1 - only thread 1 has access
- a vertex "exclusively owned" by thread 2 - only thread 2 has access
- a vertex "shared" between thread 1 and thread 2 - both thread 1 and thread 2 have access
Question 1: Does atomic update "activate itself" only when needed? And is the critical section (if applied instead) active all the time?
Meaning, is the following table correct (?):
An edge can have: ATOMIC CRITICAL
- a vertex "exclusively owned" by thread 1 OFF ON
- a vertex "exclusively owned" by thread 2 OFF ON
- a vertex "shared" between thread 1 and thread 2 ON ON
Question 2:
Theoretically, I expect:
- atomic update || loop - fastest,
- sequential loop - slower than 1),
- critical section || loop - slowest, slower than 2).
Correct?
As the comment by @PierU observed, your atomic directive makes the code correct, but probably effectively serial. You need a different approach.
See my previous answers to very similar questions:
https://stackoverflow.com/a/77273690/2044454
https://stackoverflow.com/a/76854330/2044454
This is considerably easier to code in C++, but you can emulate it in C.