Re: Coding Horrors, Cargo Cult Programming, and other Ghoulish Things



But I based my analysis on the data size, didn't I?

Oops :)
Reverse argument for the other missing parameter :)

Well, if a concatenation of empty elements is considered time consuming, the handling of the elements may become a factor.

In a complexity analysis, every operation contributes to the complexity whether or not it's considered time consuming.
The concatenation of many involves a loop, so you have the loop complexity for whatever constant-time overhead (at best) happens before or after each concatenation operation.

In typical cases the allocation of the destination memory will take the
> most time, followed by the copies.

If you have an C-like strlen implementation, determining the lengths can end up taking a lot of time too... and does, in a lot of C code out there.

But then we have another absolutely unrelated parameter, the fragmentation of the memory. This also seems to be the reason, why s1=s2+s3+s4 is so slow in .NET?

It's slow in .Net because of string immutability, GC and the lack of an aggregated '+' operator (which matters most will depend on string size, number of concats, etc.).

In native Delphi s2+s3+s4 will be performed by a single function call, which will be able to produce the result string in one step.
In .Net languages using the standard '+' operator, it will be performed
as (s2+s3)+s4, ie. two strings will be concatenated first, then that result will be concatenated with the third string.

If you concat n strings, each string's data will thus be copied n/2 times on average, n-1 strings will be created, and n-2 discarded during the concatenation (all that because of immutability).
In contrast, in native Delphi with FastMM-like reallocation strategy you're looking at an average of (ln n) copies.

Finally, if your data size breaks out of a GC generation, you'll also get hit with O(n) garbage collection mark & sweep (assuming your strings are the only things GC'ed, and there are thus no cross-references). You'll also have an O(m) (m being the data size) complexity from the compaction phase of the GC (which can become a hard hitter if your data doesn't fit in cache).

That's another thing, related to marketing. How many machines could be sold, if the average user was not busy with configuring, optimizing and reinstalling Windows?

These machines exist, they're named game consoles, and they sell by the hundreds of millions :)

Eric
.



Relevant Pages

  • Re: syntax...
    ... B&D on the part of the language designer. ... probably handle concatenation of string literals by itself, ... bitwise XOR, or if not that, then exponentiation.) ...
    (comp.lang.misc)
  • Re: =?iso-8859-1?q?Re:_Kolmorgorov_Complexity_and_Kim_=D8yhus?=
    ... >>>As far as Kolmogorov Complexity, let's look at the basic definition one ... When discussion a string of characters. ... >>>the degree of a string's compressibility. ... >>>plays an important role in the concept of chaos. ...
    (talk.origins)
  • Re: Just for fun...
    ... complexity in practical scenarios... ... corresponding complexities for a given string can differ at most by a constant ... When I first learned about the concept of compression, ... surprise in the questions/issues being discussed, ...
    (comp.lang.ruby)
  • Bugs in the Module::Dependency
    ... Manifying blib/man1/pmd_indexer.plx.1 ... Use of uninitialized value in concatenation or string at ... # Failed test in t/04grapher.t at line 81. ...
    (perl.dbi.users)
  • =?iso-8859-1?q?Re:_Kolmorgorov_Complexity_and_Kim_=D8yhus?=
    ... >>> erm, if your string is minimum, then it is compressed for the reference ... The sequence of an otherwise functional string may be ... >> compressed without regard to its function, but this compression will ... When Sean says that his definition of functional complexity includes ...
    (talk.origins)