Re: bsr-algorithm?
- From: Terje Mathisen <spamtrap@xxxxxxxxxx>
- Date: Wed, 21 May 2008 21:33:48 +0200
spamtrap wrote:
Hy;
I saw the discussion about the popcnt, and it reminded me of my own
search of a solution to this function:
/* first-bit position */
static int __fastcall fbpos(short int e) {
but in MMX/XMMX.
I didn't find any slight dicussion of 'emulating' lzcnt or bsr at all,
and the solutions I could think of are very unsmart and probably
slower than sequencially transfering the data into GPRs and do the bsr
there.
I think on AltiVec helps the permutation-op, that we don't have on
x86.
Anybody got an idea about a branch-free 'emulation' of bsr?
Sure!
Getting the most significant bit is easy in SSE, as long as you don't need to work with integers larger than 2^24, or can stand the overhead of converting to double first:
Convert the 4x32-bit (i.e. 4x24-bit) integers into 4xfloat, then treat the result as integers, extracting the exponent field and subtracting 127 to get the most significant bit position.
If you have to handle zero, then you'll have to add a compare against zero either before or after the conversion, then use that as a mask to POR on top of the regular result, so that you'll return -1 for zero inputs.
Branchless, maybe 10 instructions, mostly fast opcodes.
OK?
Terje
--
- <Terje.Mathisen@xxxxxxxxxxxxx>
"almost all programming can be viewed as an exercise in caching"
.
- Follow-Ups:
- Re: bsr-algorithm?
- From: spamtrap
- Re: bsr-algorithm?
- References:
- bsr-algorithm?
- From: spamtrap
- bsr-algorithm?
- Prev by Date: bsr-algorithm?
- Next by Date: Re: bsr-algorithm?
- Previous by thread: bsr-algorithm?
- Next by thread: Re: bsr-algorithm?
- Index(es):
Relevant Pages
|