How to bitwise flip an char array by diagonal effectively in C/C++ or Cuda?

Question

How to bitwise flip an char array by diagonal effectively in C/C++ or Cuda?

335 views Asked by Xiangyu.Wu At 25 December 2024 at 13:36

I have a char array char input[8] = "abcdabcd", and I want to bitwise flip it by diagonal, which mean
input:

input[0] == 'a': 0 1 1 0 0 0 0 1
input[1] == 'b': 0 1 1 0 0 0 1 0
input[2] == 'c': 0 1 1 0 0 0 1 1
input[3] == 'd': 0 1 1 0 0 1 0 0
input[4] == 'a': 0 1 1 0 0 0 0 1
input[5] == 'b': 0 1 1 0 0 0 1 0
input[6] == 'c': 0 1 1 0 0 0 1 1
input[7] == 'd': 0 1 1 0 0 1 0 0

output:

                   a b c d a b c d
output[0] ==   0 : 0 0 0 0 0 0 0 0
output[1] == 255 : 1 1 1 1 1 1 1 1
output[2] == 255 : 1 1 1 1 1 1 1 1
output[3] ==   0 : 0 0 0 0 0 0 0 0
output[4] ==   0 : 0 0 0 0 0 0 0 0
output[5] ==  17 : 0 0 0 1 0 0 0 1
output[6] == 102 : 0 1 1 0 0 1 1 0
output[7] == 170 : 1 0 1 0 1 0 1 0

It is obvious that we can use two loop combined with bitwise or operation to set the target bit one by one, however, this mean we need at least 64 * n operations, which I think is not effectively.
Since the input and output are just about reading memory in different directions (by row or by column), are there any more effectively way?
Besides, I think it is quite acceptable and make sense to do this operation based on special memory layout, or change the number or characters in array.
Thanks!

Original Q&A

There are 1 answers

**yhf8377** · Accepted Answer · 2016-12-13T16:31:49+00:00

Here is my code based on the trick from Hacker's Delight. Although it is CPU code, it can be easily transformed into parallel CUDA code.

This code itself is for transposing bitmap of arbitrary sizes. What you really need is the code for transposing an uint64_t x to another uint64_t y.

using BitBlock = uint8_t;
using BitBlocks = std::vector<BitBlock>;

void FPTransMap::transpose_bitmap( BitBlocks& bitmap, size_type blocks_per_row )
{
    assert( bitmap.size() % blocks_per_row == 0 );
    assert( ( bitmap.size() / blocks_per_row ) % 8 == 0 );

    BitBlocks transposed( bitmap.size() );
    size_type nrow = bitmap.size() / blocks_per_row, row_blocks = nrow / 8;
    for ( index_type i = 0; i < row_blocks; ++i ) {
        for ( index_type j = 0; j < blocks_per_row; ++j ) {
            uint64_t x = ( uint64_t( bitmap[   i * 8       * blocks_per_row + j ] ) << 56 ) |
                         ( uint64_t( bitmap[ ( i * 8 + 1 ) * blocks_per_row + j ] ) << 48 ) |
                         ( uint64_t( bitmap[ ( i * 8 + 2 ) * blocks_per_row + j ] ) << 40 ) |
                         ( uint64_t( bitmap[ ( i * 8 + 3 ) * blocks_per_row + j ] ) << 32 ) |
                         ( uint64_t( bitmap[ ( i * 8 + 4 ) * blocks_per_row + j ] ) << 24 ) |
                         ( uint64_t( bitmap[ ( i * 8 + 5 ) * blocks_per_row + j ] ) << 16 ) |
                         ( uint64_t( bitmap[ ( i * 8 + 6 ) * blocks_per_row + j ] ) <<  8 ) |
                         ( uint64_t( bitmap[ ( i * 8 + 7 ) * blocks_per_row + j ] ) );
            uint64_t y = (x & 0x8040201008040201LL) |
                        ((x & 0x0080402010080402LL) <<  7) |
                        ((x & 0x0000804020100804LL) << 14) |
                        ((x & 0x0000008040201008LL) << 21) |
                        ((x & 0x0000000080402010LL) << 28) |
                        ((x & 0x0000000000804020LL) << 35) |
                        ((x & 0x0000000000008040LL) << 42) |
                        ((x & 0x0000000000000080LL) << 49) |
                        ((x >>  7) & 0x0080402010080402LL) |
                        ((x >> 14) & 0x0000804020100804LL) |
                        ((x >> 21) & 0x0000008040201008LL) |
                        ((x >> 28) & 0x0000000080402010LL) |
                        ((x >> 35) & 0x0000000000804020LL) |
                        ((x >> 42) & 0x0000000000008040LL) |
                        ((x >> 49) & 0x0000000000000080LL);
            transposed[ ( j * 8 ) * row_blocks + i ]     = uint8_t( ( y >> 56 ) & 0xFF );
            transposed[ ( j * 8 + 1 ) * row_blocks + i ] = uint8_t( ( y >> 48 ) & 0xFF );
            transposed[ ( j * 8 + 2 ) * row_blocks + i ] = uint8_t( ( y >> 40 ) & 0xFF );
            transposed[ ( j * 8 + 3 ) * row_blocks + i ] = uint8_t( ( y >> 32 ) & 0xFF );
            transposed[ ( j * 8 + 4 ) * row_blocks + i ] = uint8_t( ( y >> 24 ) & 0xFF );
            transposed[ ( j * 8 + 5 ) * row_blocks + i ] = uint8_t( ( y >> 16 ) & 0xFF );
            transposed[ ( j * 8 + 6 ) * row_blocks + i ] = uint8_t( ( y >> 8 ) & 0xFF );
            transposed[ ( j * 8 + 7 ) * row_blocks + i ] = uint8_t( y & 0xFF );
        }
    }
    std::swap( bitmap, transposed );
}

TechQA.

How to bitwise flip an char array by diagonal effectively in C/C++ or Cuda?

There are 1 answers

Related Questions in C++

Related Questions in CUDA

Related Questions in BIT-MANIPULATION

Related Questions in DIAGONAL

Related Questions in MEMORY-LAYOUT

Popular Questions

Popular Tags

Trending Questions