How is multiplication in Libfixmath performed?

149 views Asked by At

I am porting the library mentioned by Wikipedia https://en.wikipedia.org/wiki/Libfixmath in CMD Batch.

The library uses an Q16.16 format.

I am trying to port it in Q16.16 or Q15.15 format.

What I can't understand is in the part of the code in fix16.c.

The multiplication in the hardware case that supports 32bit.

fix16_t fix16_mul(fix16_t inArg0, fix16_t inArg1)
{
    // Each argument is divided to 16-bit parts.
    //                  AB
    //          *    CD
    // -----------
    //                  BD  16 * 16 -> 32 bit products
    //               CB
    //               AD
    //              AC
    //           |----| 64 bit product
    int32_t A = (inArg0 >> 16), C = (inArg1 >> 16);
    uint32_t B = (inArg0 & 0xFFFF), D = (inArg1 & 0xFFFF);
    
    int32_t AC = A*C;
    int32_t AD_CB = A*D + C*B;
    uint32_t BD = B*D;
    
    int32_t product_hi = AC + (AD_CB >> 16);
    
    // Handle carry from lower 32 bits to upper part of result.
    uint32_t ad_cb_temp = AD_CB << 16;
    uint32_t product_lo = BD + ad_cb_temp;
    if (product_lo < BD)
        product_hi++;
...

A, B, C and D are numbers at 16bit. My questions is In the calculation of the carry I do not understand how it does. It takes ad_cb and makes the shift, that is, it takes the 16 low bits of AD_CB and moves them upstairs and adds them to BD. The third operation I can't understand.

if (product_lo < BD)
        product_hi++;
...

Can you explain it better? How is it related to the previous ones?

Partial Porting:

:fix15_mul F1 F2 Ret

setlocal

    set "F1=%1"
    set "F2=%2"

    set /A "A=F1>>15, C=F2>>15"

    set /A "B=F1 & 0x7FFF, D=F2 & 0x7FFF"
    
    set /A "AC=A*C, AD_CB=A*D + C*B,BD =B*D"
    
    set /A "product_hi = AC + (AD_CB >> 15)"
    
    rem Handle carry from lower 30 bits to upper part of result.
    set /A "ad_cb_temp=AD_CB << 15, product_lo = BD + ad_cb_temp"

    set /A "product_lo2 =(AD_CB & 0x7FFF)<<15"

    if %product_lo% lss %BD% set /A "product_hi+=1"

    set /A "return=(product_hi << 15) | (product_lo >> 15)"

( endlocal & set "%3=%Return%" )

goto :eof

EDIT: One Focus only

1

There are 1 answers

2
Martin Brown On

I'll try and explain and provide a sample annotated modified code below (untested) which I think might work.

fix16_t fix16_mul(fix16_t inArg0, fix16_t inArg1)
{
    // Each argument is divided to 16-bit parts.
    //                  AB   *    CD
    // -----------
    //                BD  16 * 16 -> 32 bit products
    //               CB
    //               AD
    //              AC
    //             |----| 64 bit product
    int32_t  A = (inArg0 >> 16),    C = (inArg1 >> 16);
    uint32_t B = (inArg0 & 0xFFFF), D = (inArg1 & 0xFFFF);
    
    int32_t AC = A*C; // worst case bounds -2^15*(2^15-1), 2^30
    int32_t AD = A*D; // worst case bounds -2^15*(2^16-1),(2^15-1)*(2^16-1)
    int32_t BC = B*C; // ditto. The sum of these two terms could overflow 
    uint32_t BD = B*D; // bounds 0 .. (2^16-1)^2
    
    int32_t product_hi = AC + (AD >> 16) + (BC >> 16);
    
    // Handle carry from lower 32 bits to upper part of result.
    uint32_t ad_bc_temp = (AD << 16) + (BC << 16); // as unsigned
    uint32_t product_lo = BD + ad_bc_temp;
    // these are both now unsigned ints so that when adding together
    // if an overflow occurs then product_lo < BD
    if (product_lo < BD)
        product_hi++;   // so we increment the product_hi
    ...

"the carry I do not understand how it does. It takes ad_cb and makes the shift, that is, it takes the 16 low bits of AD_CB and moves them upstairs and adds them to BD. The third operation I can't understand."

It relies on the unsigned int value of BD always becoming smaller in the event of a carry occurring (and that only one carry is possible here). I suggest you validate this code against a few of the awkward edge cases comparing against a computation done in full int64 arithmetic. Values 0x8000ffff and 0x7FFFFFFF are obvious stress tests. It is easy to make a careless slip that only affects a handful of edge cases when doing this stuff.