For example, all white pawns' attacks are either generated by shifting 7 or 9 bits to the left (or right, I could be mistaken, but I think it's easy to get the gist).
So the white pawn bitboard that looks like this
00000000000000000000000000000000000000001111111100000000
Would be shifted into this
00000000000000000000000000000001111111100000000000000000
However, if you attempt to portray this example in a 8x8 array, one of the pawns goes through the chessboard's edge.
So when the bits are shifted to generate attacks, how do the engines prevent themselves from generating ones that go through the chessboard's edges?
We'll add some whitespace. Suppose you start with a pawn position:
The newlines are just there for ease of reading, everything is packed bits. Where can the pawns move?
here, the only risk of going past an edge is at the top, which is easy to deal with.
Next, where can the pawns take? The naive solution is you take the above "move" mask, and shift it left and right 1. As you noticed, this causes wraparound.
(apologies if you dislike my choice of endianness).
We can fix this with some masking. The bits that will wrap around are at known locations. So we mask them out before we do the operation.
This mask:
before we shift right, and
before we shift left.
So to see the spots threatened by pawns, where "forward" is a right shift, we do:
and there we have it.
Similar masking can be done for other cases. For example, if we are calculating the knight take mask we need 8 masks and 8 shifts, then or them all together.
For a rook moving right 3, you mask out the right 3 columns, then do a right 3 shift on the rook mask.
At some point, however, you are likely to want to use intrinsics and SIMD processing; so the exact bit operations you have that are fast are going to depend on your handware. Maybe unpacking 8 8 bit values into 8 24 bit values, doing your mask-free operation there, then extracting the middle 8 bits is the fastest way on your particular hardware.