Re: Are f.p. manipulation functions only used in initialization?



"Charles Coldwell" <coldwell@xxxxxxxxx> wrote in message
news:m2wsmj1mh2.fsf@xxxxxxxxxxxx

"James Van Buskirk" <not_valid@xxxxxxxxxxx> writes:

Notably, on GNU/Linux (i.e. GNU glibc on Linux/x86)
/usr/include/fpu_control.h defines

#define _FPU_GETCW(cw) __asm__ __volatile__ ("fnstcw %0" : "=m" (*&cw))
#define _FPU_SETCW(cw) __asm__ __volatile__ ("fldcw %0" : : "m" (*&cw))

You can see why I gave my floating control word code in GAS: at
least the reader had some kind of chance to puzzle out what I
was doing. Easier IMO to look at Intel's documentation such as

http://download.intel.com/design/processor/manuals/253666.pdf

and write to the processor in assembly.

and

/* precision control */
#define _FPU_EXTENDED 0x300 /*libm requires double extended precision. */
#define _FPU_DOUBLE 0x200
#define _FPU_SINGLE 0x0

Documented in

http://download.intel.com/design/processor/manuals/253665.pdf
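
For readers following along, here is a minimal sketch of how the precision-control field could be changed with the glibc macros quoted above. The helper names (with_precision, set_x87_precision) and the PC_* constants are made up for illustration; only _FPU_GETCW, _FPU_SETCW and fpu_control_t come from fpu_control.h. The hardware part is compiled only on x86 when that header is available, and it affects x87 instructions only, not SSE arithmetic.

```c
/* Precision-control (PC) field of the x87 control word: bits 8 and 9.
   These names are illustrative; the values match _FPU_SINGLE,
   _FPU_DOUBLE and _FPU_EXTENDED quoted above. */
#define PC_MASK     0x0300u
#define PC_SINGLE   0x0000u
#define PC_DOUBLE   0x0200u
#define PC_EXTENDED 0x0300u

/* Return cw with its PC field replaced by pc; all other bits are kept. */
static inline unsigned short with_precision(unsigned short cw, unsigned short pc)
{
    return (unsigned short)((cw & ~PC_MASK) | (pc & PC_MASK));
}

#if (defined(__i386__) || defined(__x86_64__)) && defined(__has_include)
#if __has_include(<fpu_control.h>)
#include <fpu_control.h>

/* Read the live x87 control word, change only the PC field, write it back.
   Hypothetical helper; per the header comment, libm expects extended
   precision, so anything else is at the caller's own risk. */
static inline void set_x87_precision(unsigned short pc)
{
    fpu_control_t cw;
    _FPU_GETCW(cw);
    cw = with_precision(cw, pc);
    _FPU_SETCW(cw);
}
#endif
#endif
```

Under these assumptions, with_precision(0x027F, PC_EXTENDED) yields 0x037F, i.e. the same word with the PC bits set to b'11'.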

The comment next to _FPU_EXTENDED should be written in a bold,
blinking, red font. You will screw up all of the libm routines (such
as cos, sin, exp, log) if you change the FPU precision. Since Fortran
intrinsics are usually (always?) implemented in terms of libm
routines, you will therefore screw up the corresponding intrinsics.

Um, well, not really. x86_64 is a different animal from x86, and
at least SSE2 is guaranteed on the former platform, so in principle
one could get by without any use of x87. If you look at the gnu docs
I quoted,

http://gcc.gnu.org/onlinedocs/gcc/i386-and-x86_002d64-Options.html#i386-and-x86_002d64-Options

it says that -mpc80 is the default anyway, so at least the documentation
is in error here. In fact this whole page is garbage for Win64; I
don't know about other x86_64 platforms. For example the -mregparm=num
paragraph states that by default no parameters are passed in registers
whereas in fact four are in Win64 and lots more are in Linux x86_64.
See http://www.agner.org/optimize/calling_conventions.pdf for summary
information on default calling conventions on these platforms. This
page on gcc documentation needs a ?Redo from start.

About the only reason to change the FPU precision is if you are
writing your own libm, or software floating point (such as some of the
arbitrary precision work done by David H. Bailey and Yozo Hida), or
something similar.

You know that precision control only applies to addition, subtraction,
multiplication, division, and square root, right? Division and square
root can be accelerated by changing precision control; addition,
subtraction, and multiplication are pipelined operations that proceed
at the same pace (as measured by latency and throughput) regardless
of precision control. ISTR that LF95 messed with precision control
before doing a lot of square roots at single precision. It would
have been faster to do pipelined Goldschmidt iterations, but that
would have been too much fun :)
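
To make the Goldschmidt remark concrete, here is a rough sketch of a Goldschmidt-style square root whose inner loop uses only multiplies and adds, which is what would let it pipeline. It is not what LF95 or any real library does: the function name, the crude initial estimate of 1 for 1/sqrt(s), and the fixed count of 10 iterations are all assumptions made for illustration. A real implementation would start from a table lookup or a hardware reciprocal-square-root estimate and need far fewer steps.

```c
#include <float.h>

/* Goldschmidt-style square root (illustrative sketch, not IEEE-correct
   rounding). Handles finite s > 0 only; returns 0.0 for anything else. */
static inline double goldschmidt_sqrt(double s)
{
    double scale = 1.0;
    double x, h;
    int i;

    if (!(s > 0.0) || s > DBL_MAX)
        return 0.0;

    /* Bring s into [0.25, 1) by exact powers of 4; the square root then
       scales by the matching powers of 2. */
    while (s >= 1.0) { s *= 0.25; scale *= 2.0; }
    while (s < 0.25) { s *= 4.0;  scale *= 0.5; }

    /* Initial estimate y0 = 1 for 1/sqrt(s): x0 = s*y0, h0 = y0/2. */
    x = s;
    h = 0.5;

    /* Coupled iteration: x converges to sqrt(s), 2*h to 1/sqrt(s).
       Only multiplies and adds appear in the loop body. */
    for (i = 0; i < 10; i++) {
        double r = 0.5 - x * h;
        x += x * r;
        h += h * r;
    }
    return x * scale;
}
```

The two updates of x and h within each step are independent of each other, so they can overlap in a pipelined multiplier.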

C:\gfortran\james\archpi>test_status
4000C90FDAA22168C000
4000C90FDAA22168C000
027F
4000C90FDAA22168C235
4000C90FDAA22168C236
037F

So you can see that gfortran isn't setting precision control to b'11'
even with the -mpc80 switch in effect, so real(10) arithmetic
isn't working until I set the x87 control word by hand.
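
One way to read the dump above, assuming each 20-digit line is an x87 80-bit value (16 bits of sign and exponent, then a 64-bit significand) and each 4-digit line is a control word: the sketch below extracts the precision-control field and rounds a 64-bit significand to 53 bits with ties to even. The function names are made up for illustration, and a carry out of the top bit (which would bump the exponent) is deliberately not handled.

```c
#include <stdint.h>

/* Significand width selected by the PC field (bits 8-9) of an x87 control
   word. Returns 0 for the reserved encoding b'01'. */
static inline int significand_bits(unsigned cw)
{
    switch ((cw >> 8) & 0x3u) {
    case 0x0: return 24;  /* b'00': single */
    case 0x2: return 53;  /* b'10': double */
    case 0x3: return 64;  /* b'11': double extended */
    default:  return 0;   /* b'01': reserved */
    }
}

/* Round a 64-bit significand to its top 53 bits, round-to-nearest-even.
   Sketch only: overflow out of bit 63 is ignored. */
static inline uint64_t round_to_53(uint64_t sig)
{
    const uint64_t low_mask = 0x7FFu;  /* the 11 bits that get dropped */
    const uint64_t half     = 0x400u;  /* half of one kept-bit unit */
    const uint64_t ulp      = 0x800u;  /* one unit in the last kept bit */
    uint64_t low  = sig & low_mask;
    uint64_t kept = sig & ~low_mask;

    if (low > half || (low == half && (kept & ulp)))
        kept += ulp;
    return kept;
}
```

With these definitions, significand_bits(0x027F) is 53 and significand_bits(0x037F) is 64, and round_to_53(0xC90FDAA22168C235) is 0xC90FDAA22168C000, which is consistent with the first two result lines having been computed at 53-bit precision.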

I think what is happening here is an OS dependency. It must be that
gfortran/Win32 is linking to a Win32 math library that depends on
having the FPU in double precision, not double extended precision.
You might check to see what happens to some of the trigonometric
intrinsics after you diddle the FPU mode.

The x87 transcendental functions are documented to run at extended
precision independent of precision control so unless better argument
reduction than the hardware does is required, precision control doesn't
matter for them. If fancy argument reduction is going to happen, the
overhead of changing the precision control is a drop in the bucket.

So perhaps the double extended precision can't work on Win32 because
either the FPU is in double extended precision and the math library is
borken, or the FPU is in single precision and real*10 is borken.

I think you mean double precision, not single precision above.
Real*8 would be broken if the FPU were in single precision mode
and the software actually used the FPU for double precision
operations, which for the most part it doesn't do in x86_64
mode. I'm not sure what's going on here, but as you have noted
the value of the precision control bits is contrary to documentation
both of us have been able to find and this is what is preventing
real*10 operations (which I can't find much documentation about
in the gfortran manual) from working in any kind of expected
fashion. I don't know if there is something down there in gcc
which requires setting pc = b'10' so that it is intentionally set
up that way (like Microsoft libraries) or if it's just an accident
that can be fixed by setting pc = b'11' as I did in my example.

--
write(*,*) transfer((/17.392111325966148d0,6.5794487871554595D-85, &
6.0134700243160014d-154/),(/'x'/)); end

