Re: float -> double



wx wrote:
Is there a better way to cut the digits than going through string
representation? I don't want to lose any digits in the process.

When you cut the digits, you're losing digits. A direct float-to-double does not change the value.

Here's what's happening. Take a float with the hexadecimal representation 0x3f800001 (the smallest float greater than one). In decimal, that is 1 + 2^-23. Logically speaking, this number carries an approximation error of up to 2^-24 in either direction: it stands for any number in the range (1 + 2^-24, 1 + 3 * 2^-24). The two endpoints themselves are exact ties, which round-to-nearest-even sends to the neighboring floats.
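A minimal Java sketch of that claim, in case you want to poke at it yourself. The class and method names are made up for illustration; only the bit pattern comes from the paragraph above.

```java
public class FloatNeighbors {
    // 0x3f800001 is the bit pattern discussed above.
    static float sample() {
        return Float.intBitsToFloat(0x3f800001);
    }

    public static void main(String[] args) {
        float f = sample();

        // f is the immediate successor of 1.0f.
        System.out.println(Math.nextUp(1.0f) == f);

        // The gap between adjacent floats at this magnitude is 2^-23.
        System.out.println(Math.ulp(f) == Math.scalb(1.0f, -23));

        // A double slightly above the low end of the range (1 + 2^-24)
        // rounds back to f when narrowed to float.
        double justAboveLow = 1.0 + Math.scalb(1.0, -24) + Math.scalb(1.0, -40);
        System.out.println((float) justAboveLow == f);
    }
}
```

All three lines should print true.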

When converting to double, we get another 29 bits of precision, which means we can theoretically convert this number to any one of 536,870,912 possible representable numbers in the range. Which number should we choose? The most logical number is 1 + 2^-23--the same number we put in, just with another 29 bits of precision--so that it should now be represented as 1 + 2^-23 ± 2^-53.

That number is the same (modulo precision) as your original float.

Here's the catch. 1 + 2^-23 is exactly 1.00000011920928955078125.
The low end of our range is exactly 1.000000059604644775390625.
The upper end of our range is exactly 1.000000178813934326171875.
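If you want to see those exact expansions for yourself, java.math.BigDecimal will print them, since its double constructor keeps the binary value digit for digit. This is only a sketch; the class name and the exact() helper are my own invention.

```java
import java.math.BigDecimal;

public class ExactValues {
    // new BigDecimal(double) preserves the exact binary value of its argument.
    static String exact(double d) {
        return new BigDecimal(d).toString();
    }

    public static void main(String[] args) {
        float f = Float.intBitsToFloat(0x3f800001);

        // Passing a float widens it to double first, which is exact.
        System.out.println(exact(f));

        // Low and upper ends of the range; both are exactly representable as doubles.
        System.out.println(exact(1.0 + Math.scalb(1.0, -24)));
        System.out.println(exact(1.0 + 3 * Math.scalb(1.0, -24)));
    }
}
```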

The exact value could be printed if you really wanted to, but there's no real point, since the implicit error in the number dwarfs most of the later digits. It's like saying I'm 72.166865346776554356456678 inches tall: it misleads you as to the precision of the answer (72 inches is the honest way of formatting it).

Similarly, a float has a 23-bit mantissa (24 bits counting the implicit leading one), so its decimal representation is correct to a little more than 7 significant digits, 1.0000001 in this case (the low end rounds up). So Java will print 1.0000001, the number with the correct precision.

When I convert to a double, I get 29 more bits of precision, so the number becomes accurate to a little less than 16 significant digits. Java outputs accordingly: 1.0000001192092896.
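Here is a small sketch of the two printed forms side by side. The class and method names are hypothetical; the expected strings are the ones quoted above.

```java
public class PrintedForms {
    static final float F = Float.intBitsToFloat(0x3f800001);

    // Decimal string Java picks for the float itself.
    static String asFloat() {
        return Float.toString(F);
    }

    // Decimal string Java picks after widening the same value to double.
    static String asDouble() {
        return Double.toString((double) F);
    }

    public static void main(String[] args) {
        System.out.println(asFloat());   // 1.0000001
        System.out.println(asDouble());  // 1.0000001192092896

        // Same value either way: narrowing the double gives back the float.
        System.out.println((float) (double) F == F);
    }
}
```

Same number both times; only the count of digits Java is willing to print changes.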

When you went through the string conversion, Java built the most precise double it could for that number, which by then is a decimal value. Tenths are repeating fractions in binary, so converting the approximate value of 1.2 to binary, you get:

 Float precision [*]
 +-----------------------+
1.0011001100110011001100110011001100110011001100110011....
 +----------------------------------------------------+
 Double precision
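The bit string drawn above is the binary expansion of 1.2, so the sketch below uses 1.2 to print the fraction bits that actually get stored at each width. The helper names are made up for illustration.

```java
public class FractionBits {
    // Low 23 bits of the float encoding, zero-padded to full width.
    static String floatFraction(float f) {
        String bits = Integer.toBinaryString(Float.floatToIntBits(f) & 0x7fffff);
        return String.format("%23s", bits).replace(' ', '0');
    }

    // Low 52 bits of the double encoding, zero-padded to full width.
    static String doubleFraction(double d) {
        String bits = Long.toBinaryString(Double.doubleToLongBits(d) & 0xfffffffffffffL);
        return String.format("%52s", bits).replace(' ', '0');
    }

    public static void main(String[] args) {
        System.out.println(floatFraction(1.2f));
        System.out.println(doubleFraction(1.2));
        // Widening the float pads its fraction with zeros; it does not
        // continue the 0011 pattern.
        System.out.println(doubleFraction((double) 1.2f));
    }
}
```

The third line is the interesting one: the widened float keeps its 23 bits and gets 29 zeros appended, which is a different double from the one you get by parsing the decimal text "1.2".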

What you just did was make an assumption about the original number, and in a bad way. Imagine the conversion to decimal caused a round up: you've just invalidated a few of your precious low-order bits. Indeed, the rounding makes the conversion slightly lossy: if the original number was ..86.., and the output rounds that to ..9, it will be treated as ..900000000 instead of the "correct" ..86...

In short: float-to-double is always an exact conversion. float-to-double-via-String will cause imprecision in the later digits. Note that you've been comparing equality based on the not-fully-accurate String representation.
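To make the difference concrete, here is a rough sketch comparing the two routes. The sample value 1.2f and the names direct() and viaString() are my own assumptions, not something from the original question.

```java
public class WidenVsString {
    // Plain widening: exact, the value does not change.
    static double direct(float f) {
        return f;
    }

    // The string route: print the float's short decimal form, then parse
    // that decimal text as a double.
    static double viaString(float f) {
        return Double.parseDouble(Float.toString(f));
    }

    public static void main(String[] args) {
        float f = 1.2f;
        double a = direct(f);
        double b = viaString(f);

        System.out.println(a);
        System.out.println(b);

        // The two doubles differ...
        System.out.println(a == b);
        // ...yet both narrow back to the same float.
        System.out.println((float) a == f);
        System.out.println((float) b == f);
    }
}
```

The string route hands you the double nearest to the decimal text, which is not the value the float actually held. It only looks cleaner when printed.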

[*] Pedantic: it's going to be ...10 because of rounding.

--
Beware of bugs in the above code; I have only proved it correct, not tried it. -- Donald E. Knuth