Exact solutions for lib(ic)

Using ECLiPSe Prolog's lib(ic), I stumbled upon the following problem from David H. Bailey, "Resolving numerical anomalies in scientific computation", to which I was referred by the Unum book. Actually, it is only part of it. First, let me formulate the equation in terms of (is)/2. Also, please note that all these decimal numbers have an exact representation in radix-2 floats (which includes IEEE):

ECLiPSe Constraint Logic Programming System [kernel]
...
Version 6.2development #21 (x86_64_linux), Wed May 27 20:58 2015
[eclipse 1]: lib(ic).
...
Yes (0.36s cpu)
[eclipse 2]: X= -1, Y = 2, Null is 0.80143857*X+1.65707065*Y-2.51270273.

X = -1
Y = 2
Null = 0.0
Yes (0.00s cpu)

So this is truly 0.0 (no rounding at all). But now the same with $= in place of is:

[eclipse 3]: X= -1, Y = 2, Null $= 0.80143857*X+1.65707065*Y-2.51270273.

X = -1
Y = 2
Null = 2.2204460492503131e-16__2.2204460492503131e-16
Yes (0.00s cpu)

This interval does not contain 0.0. I am aware that interval arithmetic is often a bit too approximate, as in:

[eclipse 4]: 1 $= sqrt(1).

Delayed goals:
    0 $= -1.1102230246251565e-16__2.2204460492503131e-16
Yes (0.00s cpu)

But at least the equation holds! In the first case, however, zero is no longer included. Evidently I have not understood something. I also tried eval/1, but to no avail.

[eclipse 5]: X= -1, Y = 2, Null $= eval(0.80143857*X+1.65707065*Y-2.51270273).

X = -1
Y = 2
Null = 2.2204460492503131e-16__2.2204460492503131e-16
Yes (0.00s cpu)

What is the reason for Null not including 0.0?


(Edit after @jschimpf's surprising answer)

Here is the quotation from page 187 of the book that I interpreted as meaning that the numbers are represented exactly (now struck through).

Use a {3,5} environment, the one that can simulate IEEE single precision. The input values are exactly representable. ...
{-1, 2}
...
That did the job, computing the exact answer with fewer than half the bits used by ...

Otherwise the statement on page 184 holds:

...

0.80143857 x + 1.65707065 y = 2.51270273

The equations certainly look innocent enough. Assuming exact decimal inputs, this
system is solved exactly by x = -1 and y = 2.

Here it is, rechecked with SICStus' library(clpq):

| ?- {X= -1,Y=2,
     A = 80143857/100000000,
     B = 165707065/100000000,
     C = 251270273/100000000,
     Null = A*X+B*Y-C}.
X =  -1,
Y = 2,
A = 80143857/100000000,
B = 33141413/20000000,
C = 251270273/100000000,
Null = 0 ? 
yes

So x = -1, y = 2 is indeed the exact solution.


A precise formulation

Here is a reformulation that has no rounding problems in the input coefficients; still, the computed solution is just -∞..+∞. Thus it is trivially correct, but not usable.

[eclipse 2]: A = 25510582, B = 52746197, U = 79981812, 
C = 80143857, D = 165707065, V = 251270273,
A*X+B*Y$=U,C*X+D*Y$=V.

A = 25510582
B = 52746197
U = 79981812
C = 80143857
D = 165707065
V = 251270273
X = X{-1.0Inf .. 1.0Inf}
Y = Y{-1.0Inf .. 1.0Inf}


Delayed goals:
    52746197 * Y{-1.0Inf .. 1.0Inf} + 25510582 * X{-1.0Inf .. 1.0Inf} $= 79981812
    80143857 * X{-1.0Inf .. 1.0Inf} + 165707065 * Y{-1.0Inf .. 1.0Inf} $= 251270273
Yes (0.00s cpu)
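
For comparison, the same integer system can be handed to SICStus' library(clpq), which computes with exact rationals. A sketch (not actually run here) that, by exact elimination, should bind X to -1 and Y to 2:

| ?- {25510582*X + 52746197*Y = 79981812,
     80143857*X + 165707065*Y = 251270273}.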

Answer by jschimpf (accepted)

Several issues conspire here to create confusion:

  1. contrary to what is claimed, the three constants in the example do not have exact representations as double floats.

  2. it is not true that the initial example involves no rounding.

  3. the seemingly correct result in the first example is actually due to a fortunate rounding error. Other computation orders give different results.

  4. the exact result, given the nearest double float representation of the constants, is indeed not zero but 2.2204460492503131e-16.

  5. interval arithmetic can only give accurate results when the inputs are accurate, which isn't the case here. The constants have to be widened into intervals that include the desired decimal fraction.

  6. relational arithmetic, such as that offered by lib(ic), by its nature does not guarantee a particular evaluation order. For that reason the rounding errors can differ from those encountered during functional evaluation. The results will, however, be accurate with respect to the given constants.

The following goes into a bit more detail. As I will demonstrate some points using ECLiPSe queries, a quick word beforehand about the syntax:

  • two floats separated by a double underscore, such as 0.99__1.01, denote an interval constant with lower and upper bound, in this case a number in the vicinity of 1.

  • two integers separated by a single underscore, such as 3_4, denote a rational constant with numerator and denominator, in this case three quarters.

To demonstrate point (1), convert the float representation of 0.80143857 into a rational. This gives the precise fraction 3609358445212343/4503599627370496, which is close, but not identical, to the intended decimal fraction 80143857/100000000. The floating point representation is therefore not exact:

?- F is rational(0.80143857), F =\= 80143857_100000000.
F = 3609358445212343_4503599627370496
Yes (0.00s cpu)
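
The other two constants are inexact for the same reason: a decimal fraction has an exact binary float representation only if its reduced denominator is a power of two, and all three fractions here keep factors of 5 in their denominators. The analogous checks, sketched below with the F bindings omitted, should therefore succeed as well:

?- F is rational(1.65707065), F =\= 165707065_100000000.
?- F is rational(2.51270273), F =\= 251270273_100000000.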

The following shows how the result depends on the evaluation order (point 3 above; note that I have simplified the original example by getting rid of the irrelevant multiplications):

?- Null is -0.80143857 + 3.3141413 - 2.51270273.
Null = 0.0
Yes (0.00s cpu)

?- Null is -2.51270273 + 3.3141413 - 0.80143857.
Null = 2.2204460492503131e-16
Yes (0.00s cpu)

The order dependence proves that rounding errors occur (point 2). For those familiar with floating-point operations, it is easy to see that when adding -0.80143857 + 3.3141413, two bits of precision from 0.80143857 are lost while the exponents of the operands are being aligned. It is this lucky rounding error that gives the OP the seemingly correct result!
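
The size of this rounding error can be measured by comparing the rounded sum with the exact sum of the two double operands. A sketch of such a check; given the rational results below, it should come out as -1_4503599627370496, i.e. exactly minus one unit in the last place, which is what cancels the true residual and produces the 0.0 above:

?- Err is rational(-0.80143857 + 3.3141413)
        - (rational(3.3141413) - rational(0.80143857)).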

In reality, the second result is more accurate with respect to the floating point representations of the constants. We can prove this by repeating the computations using precise rational arithmetic:

?- Null is rational(-0.80143857) + rational(3.3141413) - rational(2.51270273).
Null = 1_4503599627370496
Yes (0.00s cpu)

?- Null is rational(-2.51270273) + rational(3.3141413) - rational(0.80143857).
Null = 1_4503599627370496
Yes (0.00s cpu)

As the additions are done with precise rationals, the result is now order-independent, and because 1_4503599627370496 =:= 2.2204460492503131e-16, this confirms the non-zero floating point result obtained above (point 4).

How can interval arithmetic help here? It works by computing with intervals that enclose the true value, such that results will always be accurate with respect to the inputs. So it is crucial to have input intervals (bounded reals in ECLiPSe terminology) that enclose the desired true value. These can be obtained by either writing them down explicitly, such as 0.80143856__0.80143858; by converting from a precise number, for instance a rational using breal(80143857_100000000); or by instructing the parser to automatically widen all floating point numbers into bounded real intervals, as follows:

?- set_flag(syntax_option, read_floats_as_breals).
Yes (0.00s cpu)

?- Null is -0.80143857 + 3.3141413 - 2.51270273.
Null = -8.8817841970012523e-16__1.3322676295501878e-15
Yes (0.00s cpu)

?- Null is -2.51270273 + 3.3141413 - 0.80143857.
Null = -7.7715611723760958e-16__1.2212453270876722e-15
Yes (0.00s cpu)

Both results now contain zero, and it becomes apparent how the precision of the result depends on the evaluation order.
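
Alternatively, the constants can be widened explicitly, without changing the global syntax flag, by using the breal conversion from rationals mentioned above. A sketch of the original computation in that style; since each breal encloses the intended decimal fraction, the resulting interval for Null should again enclose zero:

?- X = -1, Y = 2,
   Null is breal(80143857_100000000) * X
         + breal(165707065_100000000) * Y
         - breal(251270273_100000000).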