I was running some experiments with Intel Advisor 2020, in particular with the roofline model. What I can't quite understand is why the peak scalar integer performance (intop/cycle) differs from the theoretical value I would expect, especially since all the other metrics (vector integer performance, floating point, ...) more or less match.
In particular, according to Intel Advisor the peak performance for scalar integer add is around 2.3 integer operations per cycle, while the theoretical value I would expect is 4 intop/cycle, since there are 4 integer ALUs on 4 different ports.
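To make the gap concrete, here is the back-of-the-envelope arithmetic behind my expectation, as a minimal sketch. The numbers are the ones quoted above, not something the script measures, and the one-add-per-ALU-per-cycle figure is my assumption; the function names are just illustrative.

```python
# Values taken from the question; nothing here is measured by this script.
INT_ALU_PORTS = 4             # scalar integer ALUs, one per port
ADDS_PER_ALU_PER_CYCLE = 1    # assumption: each ALU completes one add per cycle
ADVISOR_PEAK = 2.3            # approximate intop/cycle reported by Advisor


def theoretical_peak(ports: int, ops_per_port_per_cycle: int) -> int:
    """Upper bound on integer ops per cycle if every ALU is busy every cycle."""
    return ports * ops_per_port_per_cycle


def fraction_of_peak(measured: float, theoretical: float) -> float:
    """How much of the theoretical bound the reported value reaches."""
    return measured / theoretical


expected = theoretical_peak(INT_ALU_PORTS, ADDS_PER_ALU_PER_CYCLE)
ratio = fraction_of_peak(ADVISOR_PEAK, expected)
print(f"expected {expected} intop/cycle, "
      f"Advisor reports about {ADVISOR_PEAK} ({ratio:.1%} of expected)")
```

So the reported scalar integer roof is a bit under 60% of what the port count alone would suggest, which is the discrepancy I am asking about.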
Am I missing something?