I am writing a simple function to step through a range with a floating-point step size. To keep the output neat, I wrote a function, correct, that corrects the floating-point error that is common after an arithmetic operation. That is to say, correct(0.3999999999) outputs 0.4, correct(0.1000000001) outputs 0.1, etc.
Here's the body of code:
floats = []
start = 0
end = 1
stepsize = 0.1
while start < end:
    floats.append(start)
    start = correct(start + stepsize)
The output, in the end, looks as if it hasn't been corrected:
[0, 0.1, 0.2, 0.30000000000000004, 0.4, 0.5, 0.6, 0.7,
0.7999999999999999, 0.8999999999999999, 0.9999999999999999]
To check this, I inserted a print statement to see what's being appended:
floats = []
start = 0
end = 1
stepsize = 0.1
while start < end:
    print start
    floats.append(start)
    start = correct(start + stepsize)
And the output of these print-statements is:
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1.0
So I know that the correct function is working properly. Thus, it appears that when start is appended to floats, something happens in the assignment that causes the floating-point error to flare up again.
Things get worse. I assign the variable z to the output, i.e. z = [0, 0.1, ...]. I try print z, which prints, as expected, the output that looks uncorrected:
[0, 0.1, 0.2, 0.30000000000000004, 0.4, 0.5, 0.6, 0.7,
0.7999999999999999, 0.8999999999999999, 0.9999999999999999]
Now I try for a in z: print a. This prints out:
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1.0
But this doesn't correspond to the output of print z: one of them shows floating-point error, the other doesn't. Finally, I try print [correct(x) for x in z]. I get the uncorrected-looking output again.
I hope I've made clear the confusion I'm experiencing. Why does printing the list look so different from printing each item individually? Why are values in a list unaffected by the correct function? I suspect this is due to some internal memory representation of floating-point numbers in lists. Finally, how can I ensure that the output is simply [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]?
I am using Python 2.7.9.
Short answer: your correct doesn't work.

Long answer:
The binary floating-point formats in ubiquitous use in modern computers and programming languages cannot represent most numbers like 0.1, just like no terminating decimal representation can represent 1/3. Instead, when you write 0.1 in your source code, Python automatically translates this to 3602879701896397/2^55, a.k.a. 0.1000000000000000055511151231257827021181583404541015625, which can be represented in binary. Python is then faced with a question of how to display 0.1000000000000000055511151231257827021181583404541015625 when you print it.
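You can ask Python to show you the exact value it stores. As a quick demonstration of my own (not part of the original answer), using the standard decimal module and float.as_integer_ratio, both available in 2.7:

from decimal import Decimal

# The exact decimal value of the float produced by the literal 0.1:
print Decimal(0.1)
# 0.1000000000000000055511151231257827021181583404541015625

# The same value as an exact fraction; the denominator is 2**55:
print (0.1).as_integer_ratio()
# (3602879701896397, 36028797018963968)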
Python has two ways to produce string representations of things: str and repr. In Python 2, the implementations of str and repr make different choices for how to display floats. Neither produces an exact representation. Instead, str truncates to 12 significant digits, which hides a bit of rounding error at the cost of displaying some very close floats as the same string, while repr (as of 2.7) produces the shortest string that round-trips to the same float, using up to 17 significant digits, so different floats will always display differently.
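For example, with the float produced by 0.1 + 0.2 (a quick illustration of my own):

x = 0.1 + 0.2
print str(x)   # 0.3                  (12 significant digits hide the error)
print repr(x)  # 0.30000000000000004  (enough digits to distinguish it from 0.3)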
When you print a float, Python uses str. This means that your print start statements make it look like your correct works. However, when you print a list, the list's implementation of str uses repr on its contents (for good reasons I won't go into here), so printing the list shows that your correct didn't actually produce the numbers you wanted. If your correct had produced the number that 0.3 in Python source code produces, then Python would have displayed it as 0.3.
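As for getting clean output: one sketch (mine, not necessarily how your correct should be written) is to have the correction return the float that the corresponding decimal literal would produce. round does exactly that in Python 2.7, and 2.7's repr then displays the result cleanly:

floats = []
start = 0.0
end = 1.0
stepsize = 0.1
while start < end:
    floats.append(start)
    # round(x, 10) returns the float closest to x rounded to 10 decimal
    # places -- the same float the literal 0.3 produces, so repr shows 0.3.
    start = round(start + stepsize, 10)

print floats
# [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]

Note that, like your original loop, this appends the starting value; begin at start = stepsize if you want the list to start at 0.1.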