Getting around floating point error with logarithms?

1.2k views Asked by At

I'm trying to write a basic digit counter (an integer is inputted and the number of digits of that integer is outputted) for positive integers. This is my general formula:

dig(x) := Math.floor(Math.log(x,10))

I tried implementing the equivalent of dig(x) in Ruby, and found that when I was computing dig(1000) I was getting 2 instead of 3 because Math.log was returning 2.9999999999999996 which would then be truncated down to 2. What is the proper way to handle this problem? (I'm assuming this problem can occur regardless of the language used to implement this approach, but if that's not the case then please explain that in your answer).

3

There are 3 answers

1
Andrey Mishchenko On

To get an exact count of the number of digits in an integer, you can do the usual thing: (in C/C++, assuming n is non-negative)

int digits = 0;
while (n > 0) {
  n = n / 10; // integer division, just drops the ones digit and shifts right
  digits = digits + 1;
}

I'm not certain but I suspect running a built-in logarithm function won't be faster than this, and this will give you an exact answer.

I thought about it for a minute and couldn't come up with a way to make the logarithm-based approach work with any guarantees, and almost convinced myself that it is probably a doomed pursuit in the first place because of floating point rounding errors, etc.

2
randomusername On

From The Art of Computer Programming volume 2, we will eliminate one bit of error before the floor function is applied by adding that one bit back in.

Let x be the result of log and then do x += x / 0x10000000 for a single precision floating point number (C's float). Then pass the value into floor.

This is guaranteed to be the fastest (assuming you have the answer in numerical form) because it uses only a few floating point instructions.

3
keshlam On

Floating point is always subject to roundoff error; that's one of the hazards you need to be aware of, and actively manage, when working with it. The proper way to handle it, if you must use floats is to figure out what the expected amount of accumulated error is and allow for that in comparisons and printouts -- round off appropriately, compare for whether the difference is within that range rather than comparing for equality, etcetera.

There is no exact binary-floating-point representation of simple things like 1/10th, for example.

(As others have noted, you could rewrite the problem to avoid using the floating-point-based solution entirely, but since you asked specifically about working log() I wanted to address that question; apologies if I'm off target. Some of the other answers provide specific suggestions for how you might round off the result. That would "solve" this particular case, but as your floating operations get more complicated you'll have to continue to allow for roundoff accumulating at each step and either deal with the error at each step or deal with the cumulative error -- the latter being the more complicated but more accurate solution.)

If this is a serious problem for an application, folks sometimes use scaled fixed point instead (running financial computations in terms of pennies rather than dollars, for example). Or they use one of the "big number" packages which computes in decimal rather than in binary; those have their own round-off problems, but they round off more the way humans expect them to.