I am trying to find a fast way to rotate and reflect a 5x5 board to store it in a transposition table. The board is represented as a bitboard as they are very fast.
The bitboard is represented like this:
20 21 22 23 24
15 16 17 18 19
10 11 12 13 14
05 06 07 08 09
00 01 02 03 04
I have have found some solutions for 8x8 bitboards https://www.chessprogramming.org/Flipping_Mirroring_and_Rotating but I can't find a solution that works for a 5x5 bitboard, I also have tried looping through all of the bits and switching them but that was a very slow solution. (C++)
Transpose could be done with 4 delta swaps, it does not decompose as nicely as an 8x8 transpose because 5 is not a power of two. 4 delta swaps would cost about 24 operations and about 20 cycles (less than 24 thanks to instruction level parallelism, but not a lot less than 24 because the code would be mostly serial due to dependencies).
That would look like this:
Where
bit_permute_step
is the familiar delta swap.An alternative way is to use multiplication to extract columns. For example
x & 0b0000100001000010000100001
selects one column, then we can choose a constant to multiply by such that the bits from the column line up in a contiguous sequence in the top 5 bits of a 32-bit word. The rest of the word would contain other stray copies of those bits, but they will be shifted out. For example (shown in C#, should be easy to port)At 23 operations, this doesn't seem to have saved anything, but it could have a lower latency thanks to the increased opportunity for instruction level parallelism.
PEXT makes that nicer/simpler: (not tested)
64-bit pext would enable extracting two columns at once, but we have to concatenate the board with itself first because pext cannot reorder: (also not tested)
Flipping horizontally or vertically would be easy with bit-group moving.