fpu code optimisation request
From: PatD (spamtrap_at_crayne.org)
Date: 07/07/04
- Next message: Stephen Sprunk: "Re: fpu code optimisation request"
- Previous message: Brandy Sullivan: "Re: Oh God, what have I done? DEBUG.EXE"
- Next in thread: Stephen Sprunk: "Re: fpu code optimisation request"
- Maybe reply: Stephen Sprunk: "Re: fpu code optimisation request"
- Reply: Robert Redelmeier: "Re: fpu code optimisation request"
- Reply: Matt Taylor: "Re: fpu code optimisation request"
- Reply: Ivan Korotkov: "Re: fpu code optimisation request"
- Reply: PatD : "Re: fpu code optimisation request"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Date: Wed, 7 Jul 2004 19:32:08 +0000 (UTC)
Consider the following pseudo-code function:
function myAND(float a, float b)
    if a = 0 then test_a = 0 else test_a = 1
    if b = 0 then test_b = 0 else test_b = 1
    result = test_a AND test_b
Or, in "normal" language, zero stays zero,
anything else becomes 1.
Compute the logical "and" of these two booleans.
(i.e. return "[0|1] and [0|1]").
All numbers are 64-bit floats.
Careful (sort of) thinking led to the
following FPU code:
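For reference, the pseudo-code can also be written as a small C function. This is only a sketch for checking results against: the name my_and and the choice to return the 0/1 result as a double are assumptions, not part of the original code.

```c
#include <assert.h>

/* Reference model of the pseudo-code: zero stays 0, anything else
   becomes 1, then the two booleans are ANDed.
   Assumption: the 0/1 result is returned as a 64-bit float. */
double my_and(double a, double b)
{
    int test_a = (a != 0.0);   /* 0 for +0.0 and -0.0, 1 otherwise */
    int test_b = (b != 0.0);   /* in C a NaN also counts as "not zero" */
    return (double)(test_a && test_b);
}
```

Calling my_and(1.5, -7.0) gives 1, while any call with a zero argument gives 0.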
fld1                         ; load 0 and 1 for later...
fldz
fld1                         ; scratch pair: st0 = 0, st1 = 1
fldz
fld qword [someplace]        ; get first value
fcomip st1                   ; compare to 0 and "forget" it
fcmovne st0,st1              ; st0 = 0 if [someplace] = 0, 1 otherwise
                             ; (a NaN compares unordered, which sets ZF, so it also gives 0)
fxch                         ; cleanup: drop the scratch 1
fstp st0
fld1
fldz
fld qword [someplace + 8]    ; idem for value 2
fcomip st1
fcmovne st0,st1
fxch
fstp st0
; ..."later" starts here: st0 = test_b, st1 = test_a, st2 = 0, st3 = 1
fcomip st2                   ; compare test_b to 0 and pop it: st0 = test_a, st1 = 0, st2 = 1
fcmove st0,st1               ; test_b = 0 forces the result to 0, otherwise it stays test_a
fxch st0,st2                 ; heavy cleaning
fcompp
; final result in st0
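The intended data flow (normalise each value to 0 or 1 with a conditional move, then combine the two with another conditional move) can be modelled in C to check the truth table. This is a sketch only: select_if is a hypothetical stand-in for an fcmovcc-style instruction, and my_and_select is a made-up name.

```c
#include <assert.h>

/* Hypothetical stand-in for a conditional move: returns if_true when
   cond is nonzero, otherwise keeps the current value. A compiler may or
   may not turn this into a branch-free instruction. */
double select_if(int cond, double if_true, double current)
{
    return cond ? if_true : current;
}

/* Model of the jump-free sequence: two 0/1 normalisations, one combine. */
double my_and_select(double a, double b)
{
    const double zero = 0.0, one = 1.0;
    double test_a = select_if(a != zero, one, zero);  /* 0 stays 0, else 1 */
    double test_b = select_if(b != zero, one, zero);
    /* keep test_a unless test_b is 0, which forces the result to 0 */
    return select_if(test_b == zero, zero, test_a);
}
```

Running all four zero/non-zero combinations through my_and_select should give 1 only when both inputs are non-zero.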
Now, this code has two rather interesting properties:
1. it works :)
2. it has no "jmp"s
However, it sort of looks sub-optimal,
and, well, I was hoping that someone here
would see some improvement to it.
(Less code, something faster, ...)
FWIW, the final code will have to run on P3s.
Any comments and suggestions welcome
cu
P