Unsigned Right Shift / Zero-fill Right Shift / >>> in PHP (Java/JavaScript equivalent)

1.7k views Asked by At

Before flagging this as a duplicate, please read below, and check my code * my updated code!

So my problem is that, I have to implement Java/JavaScript '>>>' (Unsigned Right Shift / Zero-fill Right Shift), but I can't get it work exactly the same way.

I've selected the 11 most promising implementations I've found on SO and on the web (links are added as comments in the code) and added a few test cases. Unfortunately NONE of the functions returned the same response as Java/JS to ALL of the tests. (Maybe some of them are only working on 32bit systems)

Live Code + JS+PHP results demo (click Run):
http://phpfiddle.org/main/code/bcv7-bs2q *
http://phpfiddle.org/main/code/dpkw-rxfe

The closest functions are:

// http://stackoverflow.com/a/27263298
function shr9($a,$b) { 
    if($a>=0) return $a>>$b;
    if($b==0) return (($a>>1)&0x7fffffff)*2+(($a>>$b)&1);
    return ((~$a)>>$b)^(0x7fffffff>>($b-1)); 
}

and

// http://stackoverflow.com/a/25467712
function shr11($a, $b) { 
    if ($b > 32 || $b < -32) {
        $m = (int)($b/32);
        $b = $b-($m*32);
    }

    if ($b < 0)
        $b = 32 + $b;

    if ($a < 0) 
    { 
        $a = ($a >> 1); 
        $a &= 2147483647; 
        $a |= 0x40000000; 
        $a = ($a >> ($b - 1)); 
    } else { 
        $a = ($a >> $b); 
    } 
    return $a; 
}

Unfortunately shr9 fails on (-10 >>> -3) and * (32 >> 32), but is the only to pass (-3 >>> 0); and shr11 fails on (-3 >>> 0) and also (32 >>> 32).

Test cases:

         0 >>> 3    == 0 
         3 >>> 0    == 3 
         0 >>> -3   == 0 
        -3 >>> 0    == 4294967293 (in JS); -3 (in Java)  
        10 >>> 3    == 1 
        10 >>> -3   == 0 
       -10 >>> 3    == 536870910 
       -10 >>> -3   == 7 
-672461345 >>> 25   == 107 
        32 >>> 32   == 32 
       128 >>> 128  == 128 

Edit: I found that -3 >>> 0 is equals 4294967293 only in JavaScript (why?), but in Java, it equals -3. Unfortunately, this doesn't change the fact that I still can't get any function to pass all tests.


*BIG UPDATE:

Since PHP 7, bit shift by a negative number is considered to be invalid and causes: "Fatal error: Uncaught ArithmeticError: Bit shift by negative number". According to this, I think we don't have to pass those tests, so I've updated the question and the codes.

4

There are 4 answers

1
frzsombor On BEST ANSWER

After looking into the two functions from the question ("shr9" and "shr11") and merging/tweaking the good parts, I finally found the solution. All tests passed (I even added more in the demo), and it also works for shifts by a negative number.

[Live Demo]

function unsignedRightShift($a, $b) {
    if ($b >= 32 || $b < -32) {
        $m = (int)($b/32);
        $b = $b-($m*32);
    }

    if ($b < 0) {
        $b = 32 + $b;
    }

    if ($b == 0) {
        return (($a>>1)&0x7fffffff)*2+(($a>>$b)&1);
    }

    if ($a < 0) 
    { 
        $a = ($a >> 1); 
        $a &= 0x7fffffff; 
        $a |= 0x40000000; 
        $a = ($a >> ($b - 1)); 
    } else { 
        $a = ($a >> $b); 
    } 
    return $a; 
}

This code is not just accurate, but fast too.
Benchmark results: 100000 loops in: 0.25 sec
Benchmark test: http://phpfiddle.org/main/code/mj68-1s7e

0
frzsombor On

As I was really out of ideas, I cloned the Chromium V8 engine and the Mozilla Central repo for getting SpiderMonkey. I started searching for the JS >>> operator, and finally in Mozilla's code, I found an almost 20 years old file (from 1997), js/src/tests/ecma/Expressions/11.7.3.js, which contained the operator-less code for testing "zero-filling bitwise right shift operation". After rewriting this in PHP, this code passed all the tests.

[LiveDemo]

<?php

function ToInteger( $n ) {
  $sign = ( $n < 0 ) ? -1 : 1;

  if ( $n != $n ) {
    return 0;
  }
  if ( abs( $n ) == 0 || abs( $n ) == INF ) {
    return $n;
  }
  return intval( $sign * floor(abs($n)) );
}

function ToInt32( $n ) {
  $sign = ( $n < 0 ) ? -1 : 1;

  if ( abs( $n ) == 0 || abs( $n ) == INF) {
    return 0;
  }

  $n = ($sign * floor( abs($n) )) % pow(2,32);
  $n = ( $n >= pow(2,31) ) ? $n - pow(2,32) : $n;

  return ( $n );
}

function ToUint32( $n ) {
  $sign = ( $n < 0 ) ? -1 : 1;

  if ( abs( $n ) == 0 || abs( $n ) == INF) {
    return 0;
  }

  $n = $sign * floor( abs($n) );
  $n = $n % pow(2,32);

  if ( $n < 0 ){
    $n += pow(2,32);
  }

  return ( $n );
}

function ToUint16( $n ) {
  $sign = ( $n < 0 ) ? -1 : 1;

  if ( abs( $n ) == 0 || abs( $n ) == INF) {
    return 0;
  }

  $n = ( $sign * floor( abs($n) ) ) % pow(2,16);

  if ($n <0) {
    $n += pow(2,16);
  }

  return ( $n );
}

function Mask( $b, $n ) {
  $b = ToUint32BitString( $b );
  $b = substr( $b, strlen($b) - $n );
  $b = ToUint32Decimal( $b );
  return ( $b );
}

function ToUint32BitString( $n ) {
  $b = "";
  for ( $p = 31; $p >=0; $p-- ) {
    if ( $n >= pow(2,$p) ) {
      $b .= "1";
      $n -= pow(2,$p);
    } else {
      $b .= "0";
    }
  }
  return $b;
}

function ToInt32BitString( $n ) {
  $b = "";
  $sign = ( $n < 0 ) ? -1 : 1;

  $b .= ( $sign == 1 ) ? "0" : "1";

  for ( $p = 30; $p >=0; $p-- ) {
    if ( ($sign == 1 ) ? $sign * $n >= pow(2, $p) : $sign * $n > pow(2,$p) ) {
      $b .= ( $sign == 1 ) ? "1" : "0";
      $n -= $sign * pow( 2, $p );
    } else {
      $b .= ( $sign == 1 ) ? "0" : "1";
    }
  }

  return $b;
}

function ToInt32Decimal( $bin ) {
  $r = 0;
  $sign;

  if ( intval($bin[0]) == 0 ) {
    $sign = 1;
    $r = 0;
  } else {
    $sign = -1;
    $r = -(pow(2,31));
  }

  for ( $j = 0; $j < 31; $j++ ) {
    $r += pow( 2, $j ) * intval($bin[31-$j]);
  }

  return $r;
}

function ToUint32Decimal( $bin ) {
  $r = 0;


  for ( $l = strlen($bin); $l < 32; $l++ ) {
    $bin = "0" . $bin;
  }

  for ( $j = 0; $j < 32; $j++ ) {
    $r += pow( 2, $j ) * intval($bin[31-$j]);

  }

  return $r;
}

function RShift( $s, $a ) {
  $s = ToUint32BitString( $s );
  for ( $z = 0; $z < $a; $z++ ) {
    $s = "0" . $s;
  }
  $s = substr( $s, 0, strlen($s) - $a );

  return ToUint32Decimal($s);
}

function UnsignedRightShift( $s, $a ) {
  $s = ToUint32( $s );
  $a = ToUint32( $a );
  $a = Mask( $a, 5 );
  return ( RShift( $s, $a ) );
}

Usage example: UnsignedRightShift(10, 3); ( = 10 >>> 3 )

Disclaimer: I know that this is not even close to a "professional" solution, the performance is bad (4.33s with 110,000 loops; functions in question finish ~0.04s with 110,000 loops), and maybe there are even unnecessary functions in this snippet, but currently I only had time to implement it line by line. If anyone has a better solution, better performance or cleaner code, I'm more than happy to see it.

0
JSON On

What this comes down to is each language's concepts of numbers, how they store numbers internally, as well as the overall scope of how each language is able to handle numbers.

Generally speaking, PHP stores all numbers as 64 bit ints, declared internally as a long long. Like in C/C++, a PHP numbers actual value in memory is always the same regardless of signedness. Declaring a int as signed/unsigned only effects how the value is later interpreted in order to allow the concept of negative values. Because of this, in C, all you need to do in order to perform an “unsigned shift” is cast the value to unsigned

unsigned int unsigned_shift_result = (unsigned int)somevalue >> someothrervalue;

In PHP this is done automatically, so -1 >> 0 in PHP is the same as -1 >>> 0 in Java.

Java and Javascript are somewhat different because they both have features for handling the concept of “big numbers”, that is a number that is greater than the systems INT_MAX/INT64_MAX, such as 90000000000000000000000000000000000000000000000000000000000000000000. Because of this a value may be greater than the system can handle using buildin arithmetic and bitwise instructions, so these languages includes a few additional facilities for handling numbers such as an unsigned shift. This is because 90000000000000000000000000000000000000000000000000000000000000000000 and -90000000000000000000000000000000000000000000000000000000000000000000 (as well as how they are handled) have very significant fundamental differences when compared to 1000 and -1000

With that said, Java and Javascript still deal with numbers the same as PHP, C, etc when they are within the range of what the system can handle naively. I don't see why they muddied the water by throwing in an "unsigned shift". No matter how you break it down, your just dealing with a virtual instruction as well as other higher level concepts of how a value is being represented.

1
floriskn On

I encountered an issue with the approved function attempting to implement an unsigned right shift operation. The function showed inaccuracies, especially with large numbers. Here's an overview of my revised implementation:

The initial block, which calculates the modulo of $b for values ​​above 32, can be simplified to $b %= 32. This modification ensures that we only consider the last 5 bits of the number.

Then I combined the operations for adjusting $b when it is less than 0, and used $b &= 0x1f. This bitwise AND operation with 0x1f sets the last 5 bits and produces a positive value, eliminating the need for an additional check. This is because if the value was lower than 0, the AND operation already 'wraps' the bits. Because the bits where already flipped when the number was negative.

Also we need to do $a & 0xffffffff. This change is necessary because PHP uses 64 bits for numbers, and JavaScript performs bitwise operations on 32-bit integers. By ensuring that the first 32 bits are zero, we achieve an unsigned representation, so we can safely right shift without having to fill the sign bit with a 0.

However, in large numbers the function still failed. For example 23098347343453233 >>> 78343434589435 returns 172096098 in PHP, but Javascript returns 2. To align PHP behavior with JavaScript, I cast $a and $b to float and applied the bitwise AND operation, because as previously explained PHP uses 64 bits and Javascript 32bits for bitwise operations, resulting in:

function unsignedRightShift($a, $b) {
    return ((float)$a & 0xffffffff) >> ((float)$b & 0x1f);
}

This corrected function now consistently produces results that match JavaScript for the various test cases, including large numbers.