Re: Regex speed

From: A.M. Kuchling (amk_at_amk.ca)
Date: 10/29/04


Date: Fri, 29 Oct 2004 13:51:16 -0500

On Fri, 29 Oct 2004 20:35:28 +0200,
        Reinhold Birkenfeld <reinhold-birkenfeld-nospam@wolke7.net> wrote:
> But my actual question is why Perl can run the same regexes in what
> seems no time at all.

Probably Perl's engine does some particular bit of optimization that
Python's engine isn't doing. I have no idea what optimization that might
be.

> But /usr/lib/python2.3/sre*.py are relatively large for that; what's in
> there?

That code is the compiler that turns expressions into the bytecode that the
regex engine runs. Because the compiler is written in Python, its speed only
affects re.compile(), and that function usually isn't a bottleneck.
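
A rough way to check this on your own machine is to time compilation and
matching separately. This is only a sketch for a current Python 3; the
pattern, the input text, and the helper names are made up for illustration,
and the numbers will vary by machine and Python version.

```python
import re
import timeit

# Hypothetical pattern and input, chosen only for illustration.
PATTERN = r"(\w+)@(\w+)\.(com|org|net)"
TEXT = "no match here " * 1000 + "user@example.org"


def compile_uncached():
    # re.compile() caches compiled patterns, so purge the cache first
    # to force the pattern compiler to run on every call.
    re.purge()
    return re.compile(PATTERN)


# Compile once up front; this is the usual way a pattern gets used.
compiled = re.compile(PATTERN)


def search_precompiled():
    # Only the matching engine runs here; the compiler is not involved.
    return compiled.search(TEXT)


compile_time = timeit.timeit(compile_uncached, number=200)
search_time = timeit.timeit(search_precompiled, number=200)

print("compile: %.6f s for 200 runs" % compile_time)
print("search:  %.6f s for 200 runs" % search_time)
```

If a program compiles each pattern once and then runs it over a lot of text,
the compile cost is paid a single time and the search cost dominates.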

> You're right again. Is the pre module using the PCRE C library?

An extensively hacked version of it. Modules/pypcre.c contains the bulk of
it; pcremodule.c is the Python interface to it.

--amk


