Re: bsr-algorithm?




I saw the discussion about the popcnt, and it reminded me of my own
search of a solution to this function:

/* first-bit position */
static int __fastcall fbpos(short int e) {
__asm {
/* handle '0'-case */
__asm xor eax, eax
__asm dec ax
__asm bsr ax, cx
};
}

but in MMX/XMMX.
I didn't find any slight dicussion of 'emulating' lzcnt or bsr at all,
and the solutions I could think of are very unsmart and probably
slower than sequencially transfering the data into GPRs and do the bsr
there.
I think on AltiVec helps the permutation-op, that we don't have on
x86.

Anybody got an idea about a branch-free 'emulation' of bsr?


Here it is - the code below treats input_vector as four 16-bit words
and performs BSR on them, leaving the results in mm0. For 0 values
it returns 0xffff, just like your fbpos function.

J.


const __int64 _0x8000800080008000 = 0x8000800080008000;
const __int64 _0x80ff80ff80ff80ff = 0x80ff80ff80ff80ff;
const __int64 _0x000f000f000f000f = 0x000f000f000f000f;
const __int64 _0x0003000300030003 = 0x0003000300030003;
const __int64 _0x0008000800080008 = 0x0008000800080008;
const __int64 _0x0004000400040004 = 0x0004000400040004;
const __int64 _0x0002000200020002 = 0x0002000200020002;
const __int64 _0x0001000100010001 = 0x0001000100010001;


__asm {
movq mm1, [ input_vector ]
pxor mm0, mm0
movq mm2, mm1
pxor mm1, [ _0x8000800080008000 ]
movq mm3, [ _0x0008000800080008 ]
pcmpeqw mm0, mm2
pcmpgtw mm1, [ _0x80ff80ff80ff80ff ]
pand mm3, mm1
pandn mm1, mm2
psrlw mm2, 8
paddw mm0, mm3
por mm1, mm2
movq mm3, [ _0x0004000400040004 ]
movq mm2, mm1
pcmpgtw mm1, [ _0x000f000f000f000f ]
pand mm3, mm1
pandn mm1, mm2
psrlw mm2, 4
paddw mm0, mm3
por mm1, mm2
movq mm3, [ _0x0002000200020002 ]
movq mm2, mm1
pcmpgtw mm1, [ _0x0003000300030003 ]
pand mm3, mm1
pandn mm1, mm2
psrlw mm2, 2
paddw mm0, mm3
por mm1, mm2
pcmpgtw mm1, [ _0x0001000100010001 ]
psubw mm0, mm1
}


.



Relevant Pages

  • Re: I cant download!!!
    ... On the three pc's I've upgraded from MM1 to MM2, ... left with one Movie Maker version and that's been MM2. ... > "Graham Hughes, MVP Digital Media" wrote: ...
    (microsoft.public.windowsxp.moviemaker)
  • AutoUpdate makes MovieMaker2 to Crash Computer
    ... My computer's video hardware worked fine with MovieMaker MM1 ... rolling-back MM2, I had to wait till there was a need to re-install XP. ... I avoided installing MM2 and used MM1. ...
    (microsoft.public.windowsupdate)
  • MM2 crashes computer
    ... My computer's video hardware worked fine with MovieMaker MM1 ... rolling-back MM2, I had to wait till there was a need to re-install XP. ... I avoided installing MM2 and used MM1. ...
    (microsoft.public.windowsxp.moviemaker)
  • Re: Problem with Movie Maker & Support
    ... Make backups of your MM1 database and project files before using them in MM2 ... > I recently upgraded to Movie Maker 2. ...
    (microsoft.public.windowsxp.moviemaker)
  • Re: Bit counting problem
    ... The magic behind Nils solution and a 64bit MMX version are explicited in the Athlon Optimization guide ... MOVQ MM1, MM0;v ...
    (borland.public.delphi.language.basm)