Re: Coding Horrors, Cargo Cult Programming, and other Ghoulish Things
- From: Eric Grange <egrangeNO@xxxxxxxxxxxxxxx>
- Date: Fri, 03 Nov 2006 11:11:11 +0100
But I based my analysis on the data size, didn't I?
Oops :)
Reverse argument for the other missing parameter :)
Well, if a concatenation of empty elements is considered time consuming, the handling of the elements may become a factor.
In a complexity analysis, every operation contributes to the complexity whether or not it's considered time consuming.
The concatenation of many involves a loop, so you have the loop complexity for whatever constant-time overhead (at best) happens before or after each concatenation operation.
In typical cases the allocation of the destination memory will take the> most time, followed by the copies.
If you have an C-like strlen implementation, determining the lengths can end up taking a lot of time too... and does, in a lot of C code out there.
But then we have another absolutely unrelated parameter, the fragmentation of the memory. This also seems to be the reason, why s1=s2+s3+s4 is so slow in .NET?
It's slow in .Net because of string immutability, GC and the lack of an aggregated '+' operator (which matters most will depend on string size, number of concats, etc.).
In native Delphi s2+s3+s4 will be performed by a single function call, which will be able to produce the result string in one step.
In .Net languages using the standard '+' operator, it will be performed
as (s2+s3)+s4, ie. two strings will be concatenated first, then that result will be concatenated with the third string.
If you concat n strings, each string's data will thus be copied n/2 times on average, n-1 strings will be created, and n-2 discarded during the concatenation (all that because of immutability).
In contrast, in native Delphi with FastMM-like reallocation strategy you're looking at an average of (ln n) copies.
Finally, if your data size breaks out of a GC generation, you'll also get hit with O(n) garbage collection mark & sweep (assuming your strings are the only things GC'ed, and there are thus no cross-references). You'll also have an O(m) (m being the data size) complexity from the compaction phase of the GC (which can become a hard hitter if your data doesn't fit in cache).
That's another thing, related to marketing. How many machines could be sold, if the average user was not busy with configuring, optimizing and reinstalling Windows?
These machines exist, they're named game consoles, and they sell by the hundreds of millions :)
Eric
.
- Follow-Ups:
- Re: Coding Horrors, Cargo Cult Programming, and other Ghoulish Things
- From: Hans-Peter Diettrich
- Re: Coding Horrors, Cargo Cult Programming, and other Ghoulish Things
- References:
- Re: Coding Horrors, Cargo Cult Programming, and other Ghoulish Things
- From: Eric Grange
- Re: Coding Horrors, Cargo Cult Programming, and other Ghoulish Things
- From: Hans-Peter Diettrich
- Re: Coding Horrors, Cargo Cult Programming, and other Ghoulish Things
- From: Eric Grange
- Re: Coding Horrors, Cargo Cult Programming, and other Ghoulish Things
- From: Hans-Peter Diettrich
- Re: Coding Horrors, Cargo Cult Programming, and other Ghoulish Things
- Prev by Date: Re: About .NET strategy
- Next by Date: Re: pdf metadata
- Previous by thread: Re: Coding Horrors, Cargo Cult Programming, and other Ghoulish Things
- Next by thread: Re: Coding Horrors, Cargo Cult Programming, and other Ghoulish Things
- Index(es):
Relevant Pages
|
|