Re: Count Leading Zeros (cntlzw)



> > I want to use this instruction, which is part of the PowerPC assembly
> > language, on an x86 architecture (preferably an Athlon).
> > It counts the leading zeros of a word.
> > I need it to be really fast, so I thought I would ask the
> > basm group :)
> >
> > How should it be implemented to be as fast as possible?
> >
> > Thanks in advance
> > Kostas
> >
>
> The 80x86 has the BSF (Bit Scan Forward) and BSR (Bit Scan Reverse) instructions.

BSR is the right one to use. I think the code should be something like
this (untested):

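function Cntlzw(Value: Word): Word;
asm
bsr dx, ax        // DX := index of highest set bit; ZF = 1 if AX = 0
jz @Zero

mov ax, 15
sub ax, dx        // leading zeros = 15 - bit index
jmp @Done         // skip the zero case instead of falling into it

@Zero:
mov ax, 16        // no bit set: all 16 bits are leading zeros
@Done:
end;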
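
Since the routine is untested, a slow but obviously correct reference is handy for checking it. The following is only a sketch in portable C, not Delphi; the names clz16_ref and clz32_ref are made up for illustration. The 32-bit variant is included because PowerPC cntlzw works on a 32-bit word and returns 32 for an input of zero, whereas the routine above handles a 16-bit Word.

```c
#include <stdint.h>

/* Portable reference: count leading zero bits of a 16-bit value.
   Returns 16 when value is 0. Hypothetical helper, for checking only. */
unsigned clz16_ref(uint16_t value)
{
    unsigned count = 0;
    uint16_t mask = 0x8000u;            /* start at the top bit */
    while (mask != 0 && (value & mask) == 0) {
        ++count;                        /* one more leading zero */
        mask >>= 1;                     /* move to the next lower bit */
    }
    return count;
}

/* Same idea for 32 bits, the operand width of PowerPC cntlzw.
   Returns 32 when value is 0. */
unsigned clz32_ref(uint32_t value)
{
    unsigned count = 0;
    uint32_t mask = 0x80000000u;
    while (mask != 0 && (value & mask) == 0) {
        ++count;
        mask >>= 1;
    }
    return count;
}
```

Comparing the assembler routine against clz16_ref for all 65536 inputs would cover the 16-bit case exhaustively.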

CMOV instead of the branch could provide more performance, but that may
depend on how often you get Value = 0 (and, of course, on the CPU).
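
Another way to lose the branch, different from CMOV, is to make sure the scanned value is never zero. Widen the 16-bit input to 32 bits and scan 2*Value + 1: the highest set bit of that number is at index 16 minus the leading-zero count, and for Value = 0 it is at index 0, which gives 16 without a special case. Below is a sketch of the arithmetic in portable C; highest_set_bit is a made-up stand-in for BSR (a real version would use the instruction itself), and clz16_no_zero_case is likewise a hypothetical name. Whether this beats the branch or CMOV on a given CPU would have to be measured.

```c
#include <stdint.h>

/* Stand-in for BSR: index of the highest set bit.
   The caller must pass a nonzero value. */
static unsigned highest_set_bit(uint32_t v)
{
    unsigned index = 0;
    while (v >>= 1) {
        ++index;
    }
    return index;
}

/* Leading zeros of a 16-bit value with no special case for zero. */
unsigned clz16_no_zero_case(uint16_t value)
{
    /* (value << 1) | 1 is never zero, so the scan is always defined. */
    uint32_t widened = ((uint32_t)value << 1) | 1u;
    return 16u - highest_set_bit(widened);
}
```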