Re: I need speed! Byte-for-byte comparing ...



Yeah... Don't we want jz?

That's true, I noticed that and corrected it already.. a big mistake. :(

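.more:
mov edx, [esi]      ; first dword from source
mov eax, [esi+4]    ; second dword from source
xor edx, [edi]      ; edx = 0 if the first dwords match
xor eax, [edi+4]    ; eax = 0 if the second dwords match
add esi, 8
add edi, 8
or edx, eax         ; ZF is set only if both dwords matched
jz .more            ; keep looping while the blocks are equal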

That gives a few percent more speed, but the effect is pretty
negligible: one instruction less, and possibly slightly fewer
dependencies between registers. I don't know for sure, but this should
be pretty close to how fast we can read from memory.. the "or" is the
only extra that springs to mind in a loop structured like this. I could
be wrong, it's just a hunch, and besides, there are so many flavours of
x86 already anyway. :)
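
For anyone who wants to play with the idea outside of asm, here is a
rough C sketch of the same xor/or arrangement. It is only an
illustration, not the code I timed: the function name first_diff_block
and the length bound n are made up for this example (the asm loop above
has no bound of its own), and n is assumed to be a multiple of 8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not from the original post: returns the offset of
 * the first 8-byte block in which a and b differ, or n if every block
 * matches. Assumes n is a multiple of 8. */
static size_t first_diff_block(const unsigned char *a,
                               const unsigned char *b, size_t n)
{
    size_t i = 0;

    assert(n % 8 == 0);

    while (i < n) {
        uint32_t a0, a1, b0, b1;

        /* memcpy keeps the loads legal for unaligned buffers */
        memcpy(&a0, a + i, 4);      /* mov edx, [esi]   */
        memcpy(&a1, a + i + 4, 4);  /* mov eax, [esi+4] */
        memcpy(&b0, b + i, 4);
        memcpy(&b1, b + i + 4, 4);

        /* xor each pair, then or the two results together:
         * the value is nonzero exactly when the blocks differ */
        if (((a0 ^ b0) | (a1 ^ b1)) != 0)
            break;

        i += 8;                     /* add esi, 8 / add edi, 8 */
    }
    return i;
}
```

The single "or" is what lets one branch cover both dwords, which is the
same trade the asm loop makes.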

E.g. the difference computation should be practically free with this
kind of arrangement. The next step is MMX.. anyone care to guess
whether, and by how much, that will speed things up?



