I'm looking at IBM's "Developing PowerPC Embedded Application Binary Interface (EABI) Compliant Programs", in particular at Table 4 on page 7.
The benchmark results are 88 kDhry/sec without SDA (the small data area) and only 77 kDhry/sec with SDA. I would have expected SDA not only to reduce code size but also to improve performance, because accessing a variable needs only two instructions instead of three. Can somebody explain the numbers in the table?
What am I missing?