Re: filling big array of double



Robert Houdart wrote:
Test result on P4 2.8 GHz HT:
- Pascal: 1469 ms
- Aleksandr's last IA32: 1468 ms
- Eric's FPU: 4282 ms

The version using FPU is nearly 3 times slower...

Ouch. I remembered the P4 FPU being slow, but I didn't expect it to be this slow on plain load/store ops.
On Athlon, FPU memory transfers are comparable to, and often slightly faster than, MMX ones ^_^
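
For anyone who wants to reproduce this kind of timing, here is a rough harness sketched in C rather than the Delphi/BASM used in the thread. The names fill_doubles and time_fill_ms are hypothetical, and the plain loop leaves the choice of FPU, MMX, or SSE stores to the compiler, so this is only a baseline and not any of the versions measured above.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: store `value` into each of the `count` slots of `dst`.
   Which store instructions get used is up to the compiler and its flags. */
void fill_doubles(double *dst, size_t count, double value)
{
    assert(dst != NULL || count == 0);
    for (size_t i = 0; i < count; ++i)
        dst[i] = value;
}

/* Hypothetical helper: run `passes` fills and return the elapsed CPU time
   in milliseconds, as reported by the standard clock() function. */
double time_fill_ms(double *dst, size_t count, double value, int passes)
{
    clock_t start = clock();
    for (int p = 0; p < passes; ++p)
        fill_doubles(dst, count, value);
    clock_t stop = clock();
    return 1000.0 * (double)(stop - start) / (double)CLOCKS_PER_SEC;
}
```

Calling time_fill_ms on a buffer much larger than the cache, with several passes, should give a figure dominated by memory stores. Swapping the body of fill_doubles for a hand-written variant is one way to compare approaches like the ones quoted.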


Eric