A discussion came up on Guido von Rossum's Google Plus post. It comes down to the fact that 2.1 is not exactly represented as a floating point number. Internally it's 2.0999999999999996, and this causes unexpected behavior.

These kinds of issues often come up. The confusion is caused by treating floating point numbers as exact numbers, and expecting calculations with them to produce results meaningful down to the last bit.

It is not in any way "solvable", at least not by means accessible to us (which in some sense defines it as "not a problem"). Depends too much on alignment-handling vagaries of MKL libraries, ordering of operations in BLAS, and usage, or not, of extended precision registers. See also IEEE 754: a careful reading may shed light on how different results for the same computation can arise in compliant hardware/software, even on the same machine

In addition to ambiguity in IEEE 754, which gives slight differences between different floating point libraries, there's an issue of the same library giving slightly different results on reruns with same inputs and and same machine, because of run-time optimization by the processor

The solution that Wolfram Inc came up with is to treat the last 7 bits of IEEE doubles as unknown. When testing for equality, those bits are ignored. When printing a number, it chooses representation that gives a nice printout. For instance, Print[2.0999999999999996] will display 2.1

So here's the rule of thumb for IEEE 754 floating point numbers:

When checking for equality of floating point doubles, ignore the last 7 bits of the mantissa.

## 1 comment:

Just in case you haven't seen it yet:

http://download.oracle.com/docs/cd/E19957-01/806-3568/ncg_goldberg.html

Post a Comment