Why is fetestexcept in C++ compiled to a function call rather than inlined

Question

279 views Asked by Ikaros At 25 January 2021 at 20:19

I am evaluating the use (clearing and querying) of floating-point exceptions in performance-critical ("hot") code. Looking at the generated binary, I noticed that neither GCC nor Clang expands the call into the inline instruction sequence I would expect; instead, both generate a call to the runtime library. This is prohibitively expensive for my application.

Consider the following minimal example:

#include <fenv.h>
#pragma STDC FENV_ACCESS ON

// Tests exception flags in the SSE control/status register (MXCSR) only.
inline int fetestexcept_inline(int e)
{
  unsigned int mxcsr;
  asm volatile ("vstmxcsr %0" : "=m" (mxcsr));
  return mxcsr & e & FE_ALL_EXCEPT;
}

double f1(double a)
{
    double r = a * a;
    if(r == 0 || fetestexcept_inline(FE_OVERFLOW)) return -1;
    else return r;
}

double f2(double a)
{
    double r = a * a;
    if(r == 0 || fetestexcept(FE_OVERFLOW)) return -1;
    else return r;
}

The output produced by GCC: https://godbolt.org/z/jxjzYY

The compiler seems to know that it can use the CPU-family-dependent AVX instructions for the target (it uses "vmulsd" for the multiplication). However, no matter which optimization flags I try, it always produces the much more expensive function call to glibc rather than the assembly that (as far as I understand) does what the corresponding glibc function does.

This is not intended as a complaint; I am OK with adding the inline assembly. I just wonder whether there is a subtle difference I am overlooking that could make the inline-assembly version buggy.

Original Q&A

There is 1 answer

**Florian Weimer** · Accepted Answer · 2021-01-26T06:32:44+00:00

The function call is needed to support long double arithmetic. fetestexcept has to merge the SSE and x87 FPU states, because long double operations update only the FPU status word, not the MXCSR register. Therefore, the benefit of inlining is somewhat reduced.
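
To make the difference concrete, here is a minimal sketch of a version that reads both status registers and ORs them together before masking. The name fetestexcept_merged is hypothetical, the code assumes x86-64 with GCC or Clang inline assembly (other targets fall back to the library call), and it illustrates the idea rather than reproducing glibc's source.

```cpp
#include <cfenv>

// Hypothetical helper: tests exception flags in both the x87 status word
// and the SSE MXCSR register. On x86 the exception flags occupy the same
// low bits in both registers, so the two values can simply be ORed.
inline int fetestexcept_merged(int e)
{
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    unsigned short x87_status;  // fnstsw stores a 16-bit status word
    unsigned int mxcsr;         // stmxcsr stores a 32-bit register
    asm volatile ("fnstsw %0" : "=m" (x87_status));
    asm volatile ("stmxcsr %0" : "=m" (mxcsr));
    return (x87_status | mxcsr) & e & FE_ALL_EXCEPT;
#else
    // Assumption: on any other target, defer to the library.
    return std::fetestexcept(e);
#endif
}
```

If the hot code uses only float and double (SSE/AVX arithmetic), the MXCSR-only version from the question should see the same flags; the two versions differ when a long double operation, which runs on the x87 unit, raised the exception.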
