Re: improve strlen
- From: spamtrap@xxxxxxxxxx
- Date: Mon, 24 Oct 2005 03:57:19 +0000 (UTC)
A faster way to implement strlen() is to store the length of the string,
for example:

    string x = ...;
    int length = x.length();

This can be a simple variable dereference. The next problem is to
determine the length of the string when assigning a "const char* text"
to it. This is, of course, a tradeoff:
- setup time is increased
- other activities are faster
On the other hand,
- "strlen" is always done -- at setup
But,
- "strlen" is ONLY done -- at setup
Of course, the "length" member of the string could be initialized to -1
to signal "unknown" length and computed the first time the value is
queried, a sort of lazy evaluation. On the other hand, this would
introduce a test and branch every time the value is queried, so it might
be an overall win to just compute the length at initialization.
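
The two options above (compute at setup, or start at -1 and compute on
first query) could be sketched in C roughly like this. The struct and
function names are made up for illustration, they are not from any real
string library, and the string is only borrowed, not copied:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical type: a borrowed C string plus a cached length. */
struct lstring {
    const char *text;
    int length;              /* -1 signals "unknown" length */
};

/* Eager variant: strlen is always done, but only done, at setup. */
struct lstring lstring_eager(const char *text)
{
    struct lstring s;
    assert(text != NULL);
    s.text = text;
    s.length = (int)strlen(text);
    return s;
}

/* Lazy variant: setup is cheap, the length is left unknown. */
struct lstring lstring_lazy(const char *text)
{
    struct lstring s;
    assert(text != NULL);
    s.text = text;
    s.length = -1;
    return s;
}

/* Works for both variants; this is where the extra test and branch
   shows up on every query. */
int lstring_length(struct lstring *s)
{
    if (s->length < 0)
        s->length = (int)strlen(s->text);
    return s->length;
}
```

With the eager variant a caller could read s.length directly and skip
the branch, which is the "simple variable dereference" case.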
Now to the question (at last!): I am interested in what the code that
does 16 bits or 32 bits at a time looks like. It would help to see what
you have already tried.
Here's a typical strlen() implementation in C:

    int strlen(const char* text)
    {
        const char* s = text;
        /* advance s to the terminating zero byte */
        for ( ; *s; ++s )
            ;
        return (int)(s - text);
    }
Visual C++ 8.1 Beta 2 compiles it like this:

        mov eax, OFFSET $SG-5
    $LL3@strlen:
        add eax, 1
        cmp BYTE PTR [eax], 0
        jne SHORT $LL3@strlen
        sub eax, OFFSET $SG-5
        ret 0

This is precisely the intention of the C code, translated into assembly.
Unless we have some cleverer optimization in mind, such as testing two
or more characters with a single branch, the function doesn't really
pay off to write in assembly.
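
For what it's worth, testing several characters with a single branch can
be sketched in plain C too. What follows is a minimal sketch, not what
any particular library does: the name strlen_word is made up, and the
zero-byte test is the well-known bit trick, which is my choice here and
not something from the original question. It assumes that reading a
whole aligned 32-bit word that contains the terminator is harmless. The
C standard doesn't promise that, although an aligned word never crosses
a page boundary on typical hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical name; a sketch of a 32-bits-at-a-time strlen. */
size_t strlen_word(const char *text)
{
    const char *s = text;
    assert(text != NULL);

    /* Go byte by byte until s is 4-byte aligned. */
    while (((uintptr_t)s & 3u) != 0) {
        if (*s == '\0')
            return (size_t)(s - text);
        ++s;
    }

    /* Four characters per branch. */
    for (;;) {
        uint32_t w;
        memcpy(&w, s, sizeof w);   /* aligned 32-bit load */
        /* Nonzero exactly when at least one byte of w is zero.
           ASSUMPTION: bytes after the terminator in this word are
           readable. */
        if ((w - 0x01010101u) & ~w & 0x80808080u)
            break;
        s += 4;
    }

    /* The terminator is somewhere in this word; find it. */
    while (*s != '\0')
        ++s;
    return (size_t)(s - text);
}
```

Whether this actually beats the byte loop depends on the string lengths
and the CPU, so it needs measuring.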
Optimizing x86 code these days is much different than writing for the
Pentium, too. Look at the Pentium 4 NetBurst microarchitecture: 128
registers internally, the code is translated on the fly to RISC-like
micro-instructions, and the translated code is cached (the code cache is
called the "trace cache" in the P4). AMD has a different approach in
their K8 architecture, but knowing x86 assembly doesn't tell you jack
about the runtime cost of the code unless you know how the internals
work, which is not very beneficial.
As for x86 assembly programming, well, these days I think the strong
point in favour of that sort of activity is generating code at runtime
and then executing it. Virtual machines and realtime optimized systems,
where the permutations of a computation are too numerous, spring to
mind. These are definitely areas where the other alternatives are smoked
alive. But that requires the know-how to write an optimizing compiler
(at least the backend).
That said, I think writing strlen() in assembly is not very productive,
considering how good compilers are getting these days (Intel, GNU,
Microsoft..) -- but that's just me, don't be discouraged. :)