Re: string concatenation optimizations [from python-dev Summary]

From: Phil Frost (indigo_at_bitglue.com)
Date: 08/25/04


Date: Tue, 24 Aug 2004 18:53:35 -0400
To: python-announce-list@python.org, python-list@python.org

Has adding a stringish object that supports efficient slicing,
concatenation, and mutation been considered? The C++ STL rope comes to
mind. Essentially what I have in mind is a type that's a list of byte
arrays. The value is defined as the concatenation of these arrays.

This would allow efficient implementations of things such as
s[31:35] = 'replacing a small substring with a larger one'

I think a reasonable implementation could be done using existing python
types, and if it's useful, an opmitized C implementation could be done.

This sort of thing is already on my stack of things to find on google,
write, or get someone else to write, just need the time. So what do you
think? Useful idea? Does this already exist?

On Mon, Aug 23, 2004 at 09:50:48PM -0700, Brett Cannon wrote:
> python-dev Summary for 2004-08-01 through 2004-08-15
>
> [snip]
>
> -------------------------------------------------------------------------------------
> Changing the Big-O complexity for something in the language is now a
> language feature
> -------------------------------------------------------------------------------------
> language evolution
>
> Armin Rigo came up with a way to have string concatenation in a loop
> (think ``for thing in iter_of_strings: concat_str += thing``) not be a
> quadratic algorithm thanks to some trickery for ``a = a + b`` and ``a +=
> b`` conditions for strings. The hope was to remove the (commonly
> considered) wart of having ``"".join(iter_of_strings)`` be the suggested
> way to concatenate a bunch of strings.
>
> But Guido didn't like the patch. His reasoning was that changing
> something that led to a change in the Big-O complexity of certain
> algorithms would inherently hurt other implementations of Python when
> people would start to code specifically for that performance gain. For
> instance, having Jython be able to pull this trick off is, I believe,
> near impossible. So, in order to make sure changes like this are
> considered before applying them, Guido instated a new rule that
> "implementation features that affect not just the running speed but the
> O() rate for certain algorithms should be considered language features,
> because any implementation will be required to implement them in order
> to ensure code portability" between implementations of Python.
>
> In the end, though, this went in with a warning that the speed
> performance is not portable. It is not to be used in the stdlib ever.
>
> Contributing threads:
> - `Optimized string concatenation
> <http://mail.python.org/pipermail/python-dev/2004-August/046686.html>`__
> - `PEP 0008 confusion - here it is, but don't use it?
> <http://mail.python.org/pipermail/python-dev/2004-August/047194.html>`__
>
> [snip]



Relevant Pages

  • Re: string concatenation optimizations [from python-dev Summary]
    ... as arrays of strings behind the scenes, and it'll be faster than a class. ... Essentially what I have in mind is a type that's a list of byte ...
    (comp.lang.python)
  • Re: why use the data command parameter collection
    ... cumbersome than using an iterative object oriented approach. ... quote characters or other such input by concatenating strings. ... Parameters Collection. ... Also the approach string concatenation leaves a lot to be desired (see ...
    (microsoft.public.dotnet.framework.adonet)
  • Re: deepening into fortran 90,95, 2003
    ... I've been bitten many times by the slowness of concatenation of strings. ... Support the Original G95 Project: http://www.g95.org ... Support the GNU GFortran Project: http://gcc.gnu.org/fortran/index.html ...
    (comp.lang.fortran)
  • Re: concat vs variable in string
    ... Or does it still do a concatenation behind the scenes? ... If you use single quotes instead of double, ... massive strings or a long loop. ... to be wrapped in double quoted strings, ...
    (comp.lang.php)
  • Re: From D
    ... string concatenation and allow numeric literals to implicitly concatenate? ... Did you miss the bit where Python ALREADY does this for strings? ... because two int tokens can be "concatenated" to make a single int token, ...
    (comp.lang.python)