Re: bsr-algorithm?
- From: spamtrap <spamtrap@xxxxxxxxxx>
- Date: Wed, 21 May 2008 17:45:22 -0700 (PDT)
Anybody got an idea about a branch-free 'emulation' of bsr?
Sure!
Getting the most significant bit is easy in SSE, as long as you don't
need to work with integers larger than 2^24, or can stand the overhead
of converting to double first:
Convert the 4x32-bit (i.e. 4x24-bit) integers into 4xfloat, then treat
the result as integers, extracting the exponent field and subtracting
127 to get the most significant bit position.
If you have to handle zero, then you'll have to add a compare against
zero either before or after the conversion, then use that as a mask to
POR on top of the regular result, so that you'll return -1 for zero inputs.
Branchless, maybe 10 instructions, mostly fast opcodes.
OK?
Uff, that's crazy ... you will think that too, if you know for what I
actually use the function: short <-> half-float <-> float <-> long
conversion, for transcoding and creating the exponent.
What you are actually suggesting is to treat a 16bit half-float as a
short, convert it's exponent into a float, and extracting the
exponent, to know which is the first-exponent bit set on the half-
float.
I suppose it works and is possibly even fast enough, I'd hoped for a
pure integer-algorithm though.
Thanks!
Niels
.
- Follow-Ups:
- Re: bsr-algorithm?
- From: Terje Mathisen
- Re: bsr-algorithm?
- References:
- bsr-algorithm?
- From: spamtrap
- Re: bsr-algorithm?
- From: Terje Mathisen
- bsr-algorithm?
- Prev by Date: Re: bsr-algorithm?
- Next by Date: Re: This Ain't No DOS - The BIOS Is Back - aeBIOS Int 13h proposed extension
- Previous by thread: Re: bsr-algorithm?
- Next by thread: Re: bsr-algorithm?
- Index(es):
Relevant Pages
|