Take the average of two signed numbers in C

4.5k views Asked by At

Let us say we have x and y and both are signed integers in C, how do we find the most accurate mean value between the two?

I would prefer a solution that does not take advantage of any machine/compiler/toolchain specific workings.

The best I have come up with is:(a / 2) + (b / 2) + !!(a % 2) * !!(b %2) Is there a solution that is more accurate? Faster? Simpler?

What if we know if one is larger than the other a priori?

Thanks.

D


Editor's Note: Please note that the OP expects answers that are not subject to integer overflow when input values are close to the maximum absolute bounds of the C int type. This was not stated in the original question, but is important when giving an answer.

7

There are 7 answers

5
AProgrammer On BEST ANSWER

Edit: version fixed by @chux - Reinstate Monica:

if ((a < 0) == (b < 0)) {  // a,b same sign
  return a/2 + b/2 + (a%2 + b%2)/2;
} else {
  return (a+b)/2;
}

Original answer (I'd have deleted it if it hadn't been accepted).

a/2 + b/2 + (a%2 + b%2)/2

Seems the simplest one fitting the bill of no assumption on implementation characteristics (it has a dependency on C99 which specifying the result of / as "truncated toward 0" while it was implementation dependent for C90).

It has the advantage of having no test (and thus no costly jumps) and all divisions/remainder are by 2 so the use of bit twiddling techniques by the compiler is possible.

1
Toni Rossmann On

This answer fits to any number of integers:

    int[] array = { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
    decimal avg = 0;
    for (int i = 0; i < array.Length; i++){
        avg = (array[i] - avg) / (i+1) + avg;
    }

expects avg == 5.0 for this test

5
R.. GitHub STOP HELPING ICE On

If (a^b)<=0 you can just use (a+b)/2 without fear of overflow.

Otherwise, try (a-(a|b)+b)/2+(a|b)/2. -(a|b) is at least as large in magnitude as both a and b and has the opposite sign, so this avoids the overflow.

I did this quickly off the top of my head so there might be some stupid errors. Note that there are no machine-specific hacks here. All behavior is completely determined by the C standard and the fact that it requires twos-complement, ones-complement, or sign-magnitude representation of signed values and specifies that the bitwise operators work on the bit-by-bit representation. Nope, the relative magnitude of a|b depends on the representation...

Edit: You could also use a+(b-a)/2 when they have the same sign. Note that this will give a bias towards a. You can reverse it and get a bias towards b. My solution above, on the other hand, gives bias towards zero if I'm not mistaken.

Another try: One standard approach is (a&b)+(a^b)/2. In twos complement it works regardless of the signs, but I believe it also works in ones complement or sign-magnitude if a and b have the same sign. Care to check it?

7
Santiago Alessandri On

I would do this, convert both to long long(64 bit signed integers) add them up, this won't overflow and then divide the result by 2:

((long long)a + (long long)b) / 2

If you want the decimal part, store it as a double.

It is important to note that the result will fit in a 32 bit integer.

If you are using the highest-rank integer, then you can use:

((double)a + (double)b) / 2
1
Guy Sirton On

Just a few observations that may help:

"Most accurate" isn't necessarily unique with integers. E.g. for 1 and 4, 2 and 3 are an equally "most accurate" answer. Mathematically (not C integers):

(a+b)/2 = a+(b-a)/2 = b+(a-b)/2

Let's try breaking this down:

  • If sign(a)!=sign(b) then a+b will will not overflow. This case can be determined by comparing the most significant bit in a two's complement representation.
  • If sign(a)==sign(b) then if a is greater than b, (a-b) will not overflow. Otherwise (b-a) will not overflow. EDIT: Actually neither will overflow.

What are you trying to optimize exactly? Different processor architectures may have different optimal solutions. For example, in your code replacing the multiplication with an AND may improve performance. Also in a two's complement architecture you can simply (a & b & 1).

I'm just going to throw some code out, not looking too fast but perhaps someone can use and improve:

int sgeq = ((a<0)==(b<0));
int avg = ((!sgeq)*(a+b)+sgeq*(b-a))/2 + sgeq*a
0
Nishant On

For unsigned integers the average is the floor of (x+y)/2. But the same fails for signed integers. This formula fails for integers whose sum is an odd -ve number as their floor is one less than their average.

You can read up more at Hacker's Delight in section 2.5

The code to calculate average of 2 signed integers without overflow is

int t = (a & b) + ((a ^ b) >> 1)
unsigned t_u = (unsigned)t
int avg = t + ( (t_u >> 31 ) & (a ^ b) )

I have checked it's correctness using Z3 SMT solver

0
chux - Reinstate Monica On

After accept answer (4 yr)

I would expect the function int average_int(int a, int b) to:
1. Work over the entire range of [INT_MIN..INT_MAX] for all combinations of a and b.
2. Have the same result as (a+b)/2, as if using wider math.

When int2x exists, @Santiago Alessandri approach works well.

int avgSS(int a, int b) {
  return (int) ( ((int2x) a + b) / 2);
}

Otherwise a variation on @AProgrammer:
Note: wider math is not needed.

int avgC(int a, int b) {
  if ((a < 0) == (b < 0)) {  // a,b same sign
    return a/2 + b/2 + (a%2 + b%2)/2;
  }
  return (a+b)/2;
}

A solution with more tests, but without %

All below solutions "worked" to within 1 of (a+b)/2 when overflow did not occur, but I was hoping to find one that matched (a+b)/2 for all int.


@Santiago Alessandri Solution works as long as the range of int is narrower than the range of long long - which is usually the case.

((long long)a + (long long)b) / 2

@AProgrammer, the accepted answer, fails about 1/4 of the time to match (a+b)/2. Example inputs like a == 1, b == -2

a/2 + b/2 + (a%2 + b%2)/2

@Guy Sirton, Solution fails about 1/8 of the time to match (a+b)/2. Example inputs like a == 1, b == 0

int sgeq = ((a<0)==(b<0));
int avg = ((!sgeq)*(a+b)+sgeq*(b-a))/2 + sgeq*a;

@R.., Solution fails about 1/4 of the time to match (a+b)/2. Example inputs like a == 1, b == 1

return (a-(a|b)+b)/2+(a|b)/2;

@MatthewD, now deleted solution fails about 5/6 of the time to match (a+b)/2. Example inputs like a == 1, b == -2

unsigned diff;
signed mean;
if (a > b) {
    diff = a - b;
    mean = b + (diff >> 1);
} else {
    diff = b - a;
    mean = a + (diff >> 1);
}