Re: improve strlen
- From: "jukka@xxxxxxxxxxxx" <spamtrap@xxxxxxxxxx>
- Date: Thu, 27 Oct 2005 02:21:12 +0000 (UTC)
If you're going to make a big deal out of different backends, at least
make some effort to use the instructions that are available, something
along these lines...
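pxor mm0,mm0          // mm0 = 0, the value to compare against
xloop:
pcmpeqb mm0,[esi]     // mm0 byte = 0xff where the byte at [esi] is 0, else 0x00
pmovmskb eax,mm0      // eax bits 0..7 = MSB of each byte of mm0
pxor mm0,mm0          // clear mm0 again for the next compare
add esi,8             // advance to the next 8-byte block
// ...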
Feel free to change the order if you think it helps.
- Do a 64-bit / 8-component compare: each byte cell of the dest is set
to 0xff if the corresponding byte in the source operand was zero, and
to 0x00 otherwise. Then pack the MSB of each component into the low 8
bits of a 32-bit register.
From there on the rest is too trivial to even mention. About
instruction counts: your technique uses 1.5 instructions (roughly) per
char, while this does 0.5 instructions (roughly) per char, i.e. the four
instructions in the loop above per 8 bytes, and uses 64-bit aligned
reads. How much faster it is in practice.. if you want to know.. find
out!
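
For anyone who wants the "trivial rest" spelled out, here is a rough
sketch in plain C of what the loop computes. It is not the MMX code
itself: zero_byte_mask8 only emulates the pcmpeqb + pmovmskb pair on one
8-byte block, and the names zero_byte_mask8 and strlen_mask8 are made up
for this example. Like the asm, it assumes that reading the rest of an
aligned 8-byte block past the terminator is harmless, which strict ISO C
does not promise.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Emulates pcmpeqb against zero followed by pmovmskb:
   bit i of the result is set when byte i of the 8-byte block is 0. */
static unsigned zero_byte_mask8(const unsigned char *block)
{
    unsigned mask = 0;
    for (unsigned i = 0; i < 8; ++i)
        mask |= (unsigned)(block[i] == 0) << i;
    return mask;
}

/* Hypothetical name; a sketch, not a tuned implementation. */
size_t strlen_mask8(const char *s)
{
    assert(s != NULL);
    const unsigned char *start = (const unsigned char *)s;
    const unsigned char *p = start;

    /* Head: go byte by byte until p sits on an 8-byte boundary,
       so that the block reads below are aligned. */
    while (((uintptr_t)p & 7u) != 0) {
        if (*p == 0)
            return (size_t)(p - start);
        ++p;
    }

    /* Body: one aligned 8-byte block per iteration.
       ASSUMPTION: the whole block is readable even if the
       terminator sits in the middle of it. */
    for (;;) {
        unsigned mask = zero_byte_mask8(p);
        if (mask != 0) {
            /* Index of the lowest set bit (what bsf would return). */
            unsigned i = 0;
            while ((mask & 1u) == 0) {
                mask >>= 1;
                ++i;
            }
            return (size_t)(p - start) + i;
        }
        p += 8;
    }
}
```

In the asm version the mask test and the bit scan would be a test/jz
pair after pmovmskb and a bsf on eax once the loop exits.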
p.s. don't worry, I've been around..