What CPU state has an effect on Intel FPU and SSE performance?


In trying to track down a performance issue, I ended up looking for information on what can have an effect on the performance of x87 and SSE instructions. I found that information incredibly difficult to track down as it tends to be hidden deep inside large Intel PDFs or sometimes mentioned on 3rd party websites without much explanation.

This question is about control words, bits, modes, specific data (e.g. denormals), whatever. It is not about memory bandwidth, cache, page tables, alignment or anything else memory related. I'll answer with a basic list of what I've found so far, but feel free to add more details or new state I'm not aware of.


There is 1 answer

Olivier On

So far, I've found:

  • The FPU Control Word (FCW). This has a precision-control field which affects the speed of some operations. It is mostly obsolete, since as far as I can tell it only affects x87 instructions.
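  • The MXCSR register. This affects SSE math through the DAZ (denormals are zero) and FTZ (flush to zero) bits. Calculations involving denormals are slower, and these bits avoid that cost: DAZ treats denormal inputs as zero and FTZ flushes denormal results to zero.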
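  • The state of the upper halves of the AVX (YMM) registers, which is cleared with the vzeroupper instruction. Mixing legacy SSE and AVX code while the upper halves are dirty can incur a transition penalty. There is a very technical discussion about it on the Intel forums: Software consequences of extending XMM to YMM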
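
To make the MXCSR item in the list above a bit more concrete, here is a minimal C sketch of toggling FTZ and DAZ with the `_mm_getcsr` / `_mm_setcsr` intrinsics. The helper names `set_ftz_daz` and `half_of_min_normal` are made up for illustration, and the sketch assumes an x86-64 build where scalar float math is compiled to SSE instructions (on a 32-bit x87 build MXCSR would not affect the multiplication). MXCSR is per-thread state, so this only changes the calling thread.

```c
#include <assert.h>
#include <float.h>
#include <xmmintrin.h>  /* _mm_getcsr, _mm_setcsr (SSE) */

/* MXCSR bit masks, as laid out in the Intel SDM. */
#define MXCSR_DAZ 0x0040u  /* bit 6: treat denormal inputs as zero */
#define MXCSR_FTZ 0x8000u  /* bit 15: flush denormal results to zero */

/* Hypothetical helper: switch FTZ and DAZ on or off for the calling
   thread. Returns the previous MXCSR value so it can be restored. */
static unsigned int set_ftz_daz(int enable)
{
    unsigned int old = _mm_getcsr();
    unsigned int mask = MXCSR_DAZ | MXCSR_FTZ;
    unsigned int next = enable ? (old | mask) : (old & ~mask);

    _mm_setcsr(next);
    /* Sanity check that the control bits now read back as requested. */
    assert((_mm_getcsr() & mask) == (enable ? mask : 0u));
    return old;
}

/* Hypothetical probe: halve the smallest normal float. The exact result
   is denormal, so it comes back as 0.0f when FTZ is set. The volatile
   locals keep the compiler from folding the multiply at compile time. */
static float half_of_min_normal(void)
{
    volatile float x = FLT_MIN;
    volatile float half = 0.5f;
    return x * half;
}
```

A typical pattern would be to call `set_ftz_daz(1)` before a hot loop that can tolerate the loss of denormal precision, and to pass the returned value to `_mm_setcsr` afterwards to restore the caller's state. Whether this actually helps depends on how often denormals show up in the data, so it is worth measuring.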