Re: Floating point subtraction rounding error (NOT display error)



En Fri, 14 Dec 2007 00:22:18 -0300, Keflavich <keflavich@xxxxxxxxx> escribió:

On Dec 13, 5:52 pm, Steven D'Aprano <st...@REMOVE-THIS-
cybersource.com.au> wrote:
On Thu, 13 Dec 2007 14:30:18 -0800, Keflavich wrote:
> Hey, I have a bit of code that died on a domain error when doing an
> arcsin, and apparently it's because floating point subtraction is having
> problems.

I'm not convinced that your diagnosis is correct. Unless you're using
some weird, uncommon hardware, it's unlikely that a bug in the floating
point subtraction routines has escaped detection. (Unlikely, but not
impossible.) Can you tell us what values give you incorrect results?

Here's a better (more complete) example [btw, I'm using ipython]:

In [39]: x = 3.1 + .6
In [40]: y = x - 3.1
In [41]: y == 6
Out[41]: False
In [42]: x == 3.7
Out[42]: True
In [43]: 3.1+.6-3.1 == .6
Out[43]: False
In [45]: (3.1+.6-3.1)/.6
Out[45]: 1.0000000000000002
In [46]: (3.1+.6-3.1)/.6 == 1
Out[46]: False
In [47]: (3.1+.6-3.1)/.6 > 1
Out[47]: True

Let's see how these float numbers are represented: significand * 2**exponent where 1<=significand<2 (like scientific notation but using base 2, not base 10) and with 53 binary digits available for the significand (this is more or less the format used by IEEE754 floating point, implemented in hardware on all platforms where Python is available and I know of, except maybe some cell phones):

3.1 = 3.1000000000000001 = 0.77500000000000002 x 2**2 = 1.1000110011001100110011001100110011001100110011001101 x 2**1

0.6 = 0.59999999999999998 = 0.59999999999999998 x 2**0 = 1.0011001100110011001100110011001100110011001100110011 x 2**-1

3.1+0.6 = 3.7000000000000002 = 0.92500000000000004 x 2**2 = 1.1101100110011001100110011001100110011001100110011010 x 2**1

3.1+0.6-3.1 = 0.60000000000000009 = 0.60000000000000009 x 2**0 = 1.0011001100110011001100110011001100110011001100110100 x 2**-1

where the 1.xxxx are binary fractions.

Let's compute 3.1+0.6 using the binary form:

3.1 = 11.000110011001100110011001100110011001100110011001101
0.6 = .100110011001100110011001100110011001100110011001100 11
sum = 11.101100110011001100110011001100110011001100110011010

Notice that, to align both numbers at their decimal point (hmm, binary point!), we had to discard the two less significant binary digits of 0.6. The result, however, is still the same as if we had done the computation exactly and then rounded to 53 significant binary digits. (that's why x == 3.7 is True in your example above).

Now let's substract 3.1 from the result:
sum = 11.101100110011001100110011001100110011001100110011010
3.1 = 11.000110011001100110011001100110011001100110011001101
dif = .100110011001100110011001100110011001100110011001101 __

There are 51 significant bits there; as we use 53 bits in the representation, the two less significant bits are set to 0 (they're "unknown", in fact). We lose those 2 bits because we are substracting numbers close to each other (this is known as cancellation). The error is only 1 unit of the least significant binary digit (1x2**-53) but enough to make that number different to the representation of 0.6 (they differ by the least possible amount).

Note that this is the best result you can get with a limited precision of 53 bits, and it's not even Python who computes the value, but the underlying C math library (and very likely, using the hardware). IEEE754 defines substraction very precisely so people can get reliable results on any platform, and this is not an exception.

> I know about the impossibility of storing floating point
> numbers precisely, but I was under the impression that the standard used
> for that last digit would prevent subtraction errors from compounding.

What gave you that impression? Are you referring to guard digits?

I believe that's what I was referring to; I have only skimmed the
topic, haven't gone into a lot of depth

Substraction of numbers of similar magnitude is often a problem, like addition of very dissimilar numbers.

I should also mention that of course your answer will deviate, due to the
finite precision of floats.

I'm adding and subtracting things with 1 decimal point; I can do the
math for an given case trivially. Are you suggesting that I don't
know what the exact floating point quantity should be?

Notice that those values have an exact *decimal* representation but are *periodic* written in binary form, as you can see from above. Any finite binary representation must be approximate then. It's like trying to write 1/3 in decimal form, you can't do that exactly without requiring infinite digits.

Have you read this?http://docs.sun.com/source/806-3568/ncg_goldberg.html

I started to, but didn't get through the whole thing. I'm saving that
for bedtime reading after finals.

At least overview the topics - you might not be interested in the theorem proofs and some details, but the consequences and discussion are worth reading.

Anyway, whether or not it's stated in that document, as I presume it
is, the problem I have resulted from a different level of precision
for numbers <1 and numbers >1. i.e.
1.1000000000000001
.90000000000000002
so the larger the number (in powers of ten), the further off the
difference between two numbers will be. I suppose that's why the
"guard digits" did nothing for me - I subtracted a "real number" from
a guard digit. Still, this seems like very unintuitive behavior; it
would be natural to truncate the subtraction at the last digit of
1.1... rather than including the uncertain digit in the final answer.

Well, this is what "floating" point means: the number of significant digits after the decimal point is not fixed. You appear to want a "fixed point" approach; for some applications fixed point is better suited (money accounting, where 4 decimal places are usually enough) but for general, wide range numbers, floating point is better.

--
Gabriel Genellina

.



Relevant Pages

  • Re: float bug? perl 5.8, DBI and oracle 10.2.0
    ... precision numbers in oracle, you've got 38 decimal digits to play ... and with minimal coaxing perl will handle them as ... digits from a 32 bit floating point number - I'll go out on a limb ... and hazard that one can expect 12 or so digits from a 64 bit floating ...
    (perl.dbi.users)
  • Re: Linear Algebra Challenge
    ... Since I'm using floating point, so I'll never be able to calculate one ... floating point math set to 99 digits. ... As close as I'm willing to wait if I use arbitrary precision. ... This mode is fast; when you select arbitry ...
    (comp.sys.hp48)
  • Re: Interesting math
    ... Floating point number represents a real number with 6 digits precision. ... Floating point numbers are denoted by the keyword float. ...
    (alt.usage.english)
  • Re: Interesting math
    ... Floating point number represents a real number with 6 digits precision. ... Floating point numbers are denoted by the keyword float. ...
    (alt.usage.english)
  • Re: How do you round off a float?
    ... o Floating point formats are almost always in binary, ... binary digits is very different from rounding to a certain number of decimal ... o A corollary to the above rule is use the maximum precision available on ...
    (comp.lang.cpp)