Re: storing an integer in a double precision

On May 5, 2:35 pm, Lynn McGuire <l...@xxxxxxxxxx> wrote:
On 5/5/2011 12:08 PM, Richard Maine wrote:

Lynn McGuire <l...@xxxxxxxxxx> wrote:

Is there a rule of thumb for the biggest integer that
I can store in a double precision variable without losing
the integer value due to roundoff?  BTW, I use a F77

No particular "rule of thumb". Just look at the particular
representations. It doesn't have much to do with the version of Fortran,
but with the physical representation used for double precision (and, to
a lesser extent for integer, except that you won't run into any machines
that store integers in other than binary). Heck, it barely even has to
do with Fortran. (A little, but barely; the little has to do with how
Fortran compilers could select from different physical representations
supported by the hardware).

Look at how many bits are in the mantissa of the representation for
double. That's about how large an integer you could store without
roundoff. If you want the exact number, you have to look more carefully
and consider things like hidden bits (and on old IBM mainframes,
exponent radix). But for a rough approximation, just look at the number
of bits in the mantissa.

Most compilers these days use IEEE double, which has, if I recall
correctly, 53 bits in the mantissa. So your answer would be somewhere
around 2**53.
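
The bit count does not have to be looked up by hand: the standard DIGITS and RADIX intrinsics report it for whatever representation the compiler uses. What follows is only a minimal sketch, not part of the original reply. It assumes a Fortran 90 or later compiler (so not strict F77) and that a 64-bit integer kind is available; the names dp_integer_limit, i8, dp, limit and x are made up for illustration.

```fortran
program dp_integer_limit
  implicit none
  ! Assumption: the compiler offers an integer kind with at least 18 decimal digits.
  integer, parameter :: i8 = selected_int_kind(18)
  integer, parameter :: dp = kind(1.0d0)
  integer(i8) :: limit
  real(dp)    :: x

  ! RADIX and DIGITS describe the floating-point model of the argument's kind.
  print *, 'radix  =', radix(1.0_dp)
  print *, 'digits =', digits(1.0_dp)

  ! Every integer with magnitude up to radix**digits is exactly representable.
  limit = int(radix(1.0_dp), i8) ** digits(1.0_dp)
  print *, 'limit  =', limit

  ! Round trip integer -> double -> integer at the limit and one past it.
  x = real(limit, dp)
  print *, 'limit     survives the round trip:', int(x, i8) == limit
  x = real(limit + 1_i8, dp)
  print *, 'limit + 1 survives the round trip:', int(x, i8) == limit + 1_i8
end program dp_integer_limit
```

With IEEE double I would expect this to report radix 2 and 53 digits, with the first round trip succeeding and the second failing, but that depends on the compiler and hardware, so treat it as something to check rather than a guarantee.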

But do double precision variables actually store converted 32-bit
integers without roundoff?  So roundoff only comes
into play for whole numbers greater than 52 bits?


Hmmm... I think you can make a little program that checks every
possibility (I like brute force whenever I can use it)... or I'm
completely lost...

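I haven't compiled this, so take it as a sketch; it assumes
32-bit signed default integers:
program roundtrip
  implicit none
  integer, parameter :: i8 = selected_int_kind(18)  ! 64-bit loop counter
  integer :: ii, idi
  integer(i8) :: k, nbad = 0
  double precision :: di
  do k = -2_i8**31, 2_i8**31 - 1
    ii  = int(k)      ! each 32-bit signed value in turn
    di  = dble(ii)    ! integer -> double precision
    idi = int(di)     ! double precision -> integer
    if (ii /= idi) nbad = nbad + 1   ! count values that do not survive
  end do
  print *, 'mismatches:', nbad
end program roundtrip
dble() and int() make the "translation" explicit ("cast", I think, is
the C wording); plain assignment would convert implicitly as well. The
loop counter is a 64-bit integer because a default-integer loop over
this range would overflow (the trip count, 2**32, does not fit in 32
bits).

If I did the math OK, that is 2**32, or about 4.3 billion, iterations,
so the runtime should not be too long...

If that's OK and somebody compiles and runs it, I would like to see
the result.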

Please tell me if I'm wrong.