Re: COMPARE HLL/ASM
- From: "Wolfgang Kern" <nowhere@xxxxxxxx>
- Date: Wed, 26 Dec 2007 15:21:08 +0100
Wannabee wrote:
OK, here is the whole story so far:
It's not optimised at all and there might be faster algos too,
but it contains not a single branch and uses only four GP-regs,
so its timing isn't value-dependent.
I got ~38 cycles on AMD K7 under KESYS (cache aligned, prefetched),
and an average of ~150 cycles on AMD64 with XP-home.
It may be a bit slower on Intel CPUs because of the Shifts.
As I just saw in the other post, you meant a 64-bit result,
so I expanded my first version to 128->64 bits; it now shows
85 cycles on the KESYS-K7 and ~300 with windoze.
Seems this M$-stuff got heavy cache-issues ;)
So I also tried a short code version and, what a surprise, it
again takes ~150 cycles per pass with windoze (~95 with mine).
I think we are mostly measuring cache-miss penalties, and the time
our code takes is only a minor factor on windoze.
Hope you can get better figures with Linux.
I timed the first one to 152 cycles on win2000.
The second one is 500+ cycles here... (K7)
Is there a faster way to "pack" those BCD
numbers? Are there MMX or SSE instructions that
do it any faster? Like for instance converting the whole
thing at once? (I was looking but so far I can't find the ones I want,
but it seems there are instructions dealing with "(un)packed" BCD..?)
I've seen combinations of:
MASKMOVDQU PACKUSBW PMINUB/PMAXUB PSADBW (a bit detouring, so no gain)
What I found really usable from SSE/XMM were the PADD..PXOR group,
but I too miss a PSHUFUB instruction.
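For readers without the posted code at hand, here is a minimal C sketch of what a branch-free "pack" with only shifts, ORs and masks can look like. It is not the routine timed in this thread. The names pack_bcd8 and pack_bcd16 are hypothetical, and the layout (one digit per byte, least significant digit in the lowest byte) is an assumption. Because the high nibble of each byte is masked off first, ASCII digits '0'..'9' should work as input too.

```c
#include <stdint.h>

/* Hypothetical helper (not from the original post): pack eight unpacked
   BCD digits, one per byte with the least significant digit in the
   lowest byte, into eight nibbles. No branches, only shifts and masks. */
static uint32_t pack_bcd8(uint64_t x)
{
    x &= 0x0F0F0F0F0F0F0F0FULL;                   /* keep the low nibble of each byte */
    x = (x | (x >> 4))  & 0x00FF00FF00FF00FFULL;  /* two digits  -> one byte          */
    x = (x | (x >> 8))  & 0x0000FFFF0000FFFFULL;  /* two bytes   -> one 16-bit word   */
    x = (x | (x >> 16)) & 0x00000000FFFFFFFFULL;  /* two words   -> one 32-bit dword  */
    return (uint32_t)x;
}

/* Hypothetical 128->64 bit variant: 16 unpacked digits, supplied as two
   64-bit halves, become one 64-bit packed BCD value. */
static uint64_t pack_bcd16(uint64_t lo, uint64_t hi)
{
    return ((uint64_t)pack_bcd8(hi) << 32) | (uint64_t)pack_bcd8(lo);
}
```

With that layout, pack_bcd8(0x0102030405060708) gives 0x12345678. Every input takes the same instruction path, which is the property claimed for the branchless version above; how many cycles it costs on a given CPU has to be measured.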
Anyway, thanks for the code, I keep it and try to learn from it.
:) be aware that these code examples are just quickly typed hacks ...
I actually just reversed the functionality of 'our' 48 Cycle Test.
__
wolfgang