Formula to convert single precision from binary to decimal

I need to show that the single-precision format, which has 24 bits of precision in total, is equivalent to about 7 decimal digits of precision.

The expression I found was simply log10(2^24) ≈ 7.225, which yields the expected result of 7 digits of precision. However, I can't find a formal proof of this expression or an intuition for how it works.

There are 2 answers

chux - Reinstate Monica

intuition on how it works

For the common float (IEEE 754 binary32):

Between [1.0... 2.0) there are 2^23 different float values, evenly spaced.
Between [2.0... 4.0) there are 2^23 different float values, evenly spaced.
Between [4.0... 8.0) there are 2^23 different float values, evenly spaced.

Between [8.0...10.0) there are 1/4 * 2^23 different float values, evenly spaced.

Now count the lower ranges at the coarser spacing used in [8.0...10.0):

Between [4.0... 8.0) there are 1/2 * 2^23 different float values spaced as far apart as those in [8.0...10.0).
Between [2.0... 4.0) there are 1/4 * 2^23 different float values spaced as far apart as those in [8.0...10.0).
Between [1.0... 2.0) there are 1/8 * 2^23 different float values spaced as far apart as those in [8.0...10.0).

Between [1.0...10.0) there are therefore 9/8 * 2^23 = 9,437,184 different float values spaced as far apart as those in [8.0...10.0).

Between [1.0...10.0) there are 9,000,000 different 7-decimal-digit values spaced 0.000 001 apart.

Since we have 9,437,184 evenly spaced float values, which is more than the 9,000,000 different decimal values to encode, we can claim "about 7 decimal digits of precision".
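
The counting above can be checked against actual binary32 bit patterns. The following is only an illustrative sketch, not part of the original answer; the helper names `float32_bits` and `count_float32` are made up for this example. It relies on the fact that, for positive finite floats, consecutive bit patterns correspond to consecutive values.

```python
import struct


def float32_bits(x: float) -> int:
    """Return the IEEE 754 binary32 bit pattern of x as an unsigned integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def count_float32(lo: float, hi: float) -> int:
    """Count binary32 values in [lo, hi).

    Assumes 0 < lo < hi and that both bounds are exactly representable in
    binary32, so consecutive bit patterns are consecutive float values.
    """
    return float32_bits(hi) - float32_bits(lo)


# Spacing of float values in [8.0, 16.0), which contains [8.0, 10.0).
coarse_step = 8.0 / 2**23

# Number of steps of that size needed to cover [1.0, 10.0).
coarse_slots = int((10.0 - 1.0) / coarse_step)

print(count_float32(1.0, 2.0) == 2**23)        # floats in [1.0, 2.0)
print(count_float32(8.0, 10.0) == 2**23 // 4)  # floats in [8.0, 10.0)
print(coarse_slots)                            # 9/8 * 2^23
```

Both comparisons should print True, and the last line should print 9437184, matching the figure derived by hand.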

Different decade ranges will result in similar "about 7 decimal digits".

IIRC, the worst case decade is just a tad less than "7 decimal digits of precision" (about 6.92), perhaps in the range [1,000,000...10,000,000).


OP's log10(2^24) ≈ 7.225 is a good first step. Yet float values are distributed linearly within each power-of-2 interval. To compare with decimal precision, we want to see the distribution across powers of 10.

The C spec defines FLT_DIG (*1), computed for binary float as below, which is close to OP's goal.

p = 24   // precision: 24 binary digits
b = 2    // base 2
q = floor((p - 1) * log10(b))
q = floor(6.923...)
q = 6

*1 FLT_DIG
number of decimal digits, q, such that any floating-point number with q decimal digits can be rounded into a floating-point number with p radix b digits and back again without change to the q decimal digits
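
The same calculation can be written as a short Python sketch. This is an illustration only; the function name `decimal_digits_guaranteed` is hypothetical, and the sketch covers only the case shown above, where the base b is not a power of 10.

```python
import math


def decimal_digits_guaranteed(p: int, b: int = 2) -> int:
    """q = floor((p - 1) * log10(b)), for a base b that is not a power of 10."""
    return math.floor((p - 1) * math.log10(b))


print(decimal_digits_guaranteed(24))  # single precision, p = 24
print(decimal_digits_guaranteed(53))  # double precision, p = 53
```

For p = 24 this gives 6, as in the calculation above; for p = 53 it gives 15.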

vinc17

There is no exact equivalence; it depends on what you really want. The problem is often the following one: you have a p-digit number in radix b, and you want to convert it to a P-digit number in radix B (to the nearest value) in such a way that if you read back the value (i.e. do the inverse conversion), you get the initial value. This problem was solved by David Matula in 1968: In-and-Out Conversions (freely available). If one radix is not a power of the other one, the formula is P = 1 + ⌈p·log(b)/log(B)⌉ (note that P may be larger, but this formula gives the minimal value for which this always works).
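
As a rough sketch, the formula translates directly into Python. The function name `min_roundtrip_digits` is made up for this illustration, and the code assumes the condition stated above (one radix is not a power of the other); it also trusts floating-point `log` not to land exactly on an integer, which holds for the small cases tried here.

```python
import math


def min_roundtrip_digits(p: int, b: int, B: int) -> int:
    """P = 1 + ceil(p * log(b) / log(B)).

    p: number of digits in the source radix b
    B: target radix
    Assumes one radix is not a power of the other, as stated in the answer.
    """
    return 1 + math.ceil(p * math.log(b) / math.log(B))


print(min_roundtrip_digits(24, 2, 10))  # binary32 -> decimal digits needed
print(min_roundtrip_digits(53, 2, 10))  # binary64 -> decimal digits needed
```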

For 2 and 10, one can apply this formula. So, if you want to be able to convert a single-precision value (b = 2, p = 24) to decimal in order to get back this value with the inverse conversion to single precision, you need 1 + ⌈24·log(2)/log(10)⌉ = 9 digits. Conversely, a 6-digit decimal number can be converted to single precision and back, because 1 + ⌈6·log(10)/log(2)⌉ = 21 ⩽ 24, but you will not always get the initial value with a 7-digit decimal number, because 1 + ⌈7·log(10)/log(2)⌉ = 25 > 24.
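
The 7-digit failure can be observed with a concrete value. The sketch below is an illustration under stated assumptions: the helper names `to_float32` and `decimal_roundtrip` are hypothetical, and the input passes through a Python float (binary64) before being rounded to binary32, which is harmless for the exactly representable inputs used here but could double-round in general.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def decimal_roundtrip(text: str, digits: int) -> str:
    """decimal text -> binary32 -> decimal text with `digits` significant digits."""
    return f"{to_float32(float(text)):.{digits - 1}e}"


# In [2^33, 2^34) binary32 values are 1024 apart, while 7-digit decimals
# there are only 1000 apart, so some 7-digit inputs cannot survive.
print(decimal_roundtrip("8.589973e9", 7))  # last digit changes
print(decimal_roundtrip("8.58997e9", 6))   # this 6-digit input survives

# Going the other way, 9 digits are enough to recover a binary32 value.
x = to_float32(0.1)
print(to_float32(float(f"{x:.8e}")) == x)
```

The first call returns 8.589974e+09 rather than the input 8.589973e9, because the nearest binary32 value is 8589973504, which rounds up at 7 digits.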