High-performance implementation of an atomic minimum operation

There is no atomic minimum operation in OpenMP, and no corresponding intrinsic in the Intel MIC instruction set.

#pragma omp critical performs very poorly.

Is there a high-performance implementation of an atomic minimum for Intel MIC?

1 answer

Kyle_the_hacker

According to the OpenMP 4.0 Specifications (Section 2.12.6), there are many fast atomic operations available through the #pragma omp atomic construct, which you can use in place of #pragma omp critical (and thereby avoid the large overhead of its lock).


Overview of the possibilities with the #pragma omp atomic construct

Let x be your thread-shared variable:

  • With #pragma omp atomic read you can atomically read your shared variable x:

    v = x;
    
  • With #pragma omp atomic write you can atomically assign a new value to your shared variable x; the new value expression (expr) must not depend on x:

    x = expr;
    
  • With #pragma omp atomic update you can atomically update your shared variable x; the new value must be an increment or decrement of x, or the result of a binary operation (binop) between x and an expression that does not depend on x:

    x++;
    x--;
    ++x;
    --x;
    x binop= expr;
    x = x binop expr;
    x = expr binop x;
    
  • With #pragma omp atomic capture you can atomically read and update your shared variable x, in either order; capture is in effect a combination of the read and update constructs:

    • You have short forms for update and then read:

      v = ++x;
      v = --x;
      v = x binop= expr;
      v = x = x binop expr;
      v = x = expr binop x;
      
    • And their structured-block analogs:

      {--x; v = x;}
      {x--; v = x;}
      {++x; v = x;}
      {x++; v = x;}
      {x binop= expr; v = x;}
      {x = x binop expr; v = x;}
      {x = expr binop x; v = x;}
      
    • And you have a few short forms for read and then update:

      v = x++;
      v = x--;
      
    • And again their structured-block analogs:

      {v = x; x++;}
      {v = x; ++x;}
      {v = x; x--;}
      {v = x; --x;}
      
    • And finally there are additional read-then-update forms (plus one read-then-write form), which exist only as structured blocks:

      {v = x; x binop= expr;}
      {v = x; x = x binop expr;}
      {v = x; x = expr binop x;}
      {v = x; x = expr;}
      

In the preceding expressions:

  • x and v are both l-value expressions with scalar type;
  • expr is an expression with scalar type;
  • binop is one of +, *, -, /, &, ^, |, << or >>;
  • binop, binop=, ++ and -- are not overloaded operators.