Re: Speed comparison of regex versus index, lc, and / /i



Ben Bullock <benkasminbullock@xxxxxxxxx> wrote in
news:g1q5ci$mf$1@xxxxxxxxxxxxxxxx:

On Fri, 30 May 2008 14:55:04 +0000, xhoster wrote:


....

And of course, if you are interested in where the string matches
(i.e. the return value of index, and not just whether or not it is
-1) then it is simpler to get it from index than from a regex.

Really? Please edit the following to show me how:

There is one change I would make to both routines. Cache length $ss
before the loop. On my machine, that cut down the time by about 40%.

#!/usr/local/bin/perl
use warnings;
use strict;

sub index_find
{
    my ($text, $ss) = @_;
    my @finds;
    my $found = 0;
    while (1) {
        $found = index ($text, $ss, $found);
        last if $found == -1;
        push @finds, $found;
        $found += length ($ss);
    }
    return \@finds;
}

sub regex_find
{
    my ($text, $ss) = @_;
    my @finds;
    while ($text =~ /\Q$ss\E/g) {
        push @finds, pos ($text) - length($ss);
    }
    return \@finds;
}
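
As an aside on where the match starts: regex_find above recovers it as pos minus the substring length. A possible alternative, sketched here under the assumption that the built-in @- array (start offsets of the last successful match) is acceptable, reads the start offset directly from $-[0]. The name regex_find_start is hypothetical and not part of the original code.

```perl
use strict;
use warnings;

# Hypothetical variant of regex_find: same loop, but the start offset
# of each match is read from $-[0] instead of pos() minus length($ss).
sub regex_find_start {
    my ($text, $ss) = @_;
    my @finds;
    while ($text =~ /\Q$ss\E/g) {
        push @finds, $-[0];    # offset where the whole match began
    }
    return \@finds;
}

my $starts = regex_find_start("abc-abc-abc", "abc");
print "@$starts\n";    # prints "0 4 8"
```

Whether this is any faster is not something the sketch measures; it only removes the subtraction.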

Here is my modified version. I do find the index_find below simpler than
your index_find.

#!/usr/local/bin/perl

use warnings;
use strict;

my $text = <<EOF;
xhoster is the coolest perl programmer ever. xhoster is the
greatest. xhoster is the champion. xhoster is a babe magnet.
EOF
my $ss = "xhoster";


sub index_find {
    my @finds;
    my $length = length $ss;

    for ( my $found = index $text, $ss, 0;
          $found >= 0;
          $found = index $text, $ss, $found ) {
        push @finds, $found;
        $found += $length;
    }
    return \@finds;
}

sub regex_find {
    my @finds;
    my $length = length $ss;
    while ($text =~ /\Q$ss\E/g) {
        push @finds, pos ($text) - $length;
    }
    return \@finds;
}

use Benchmark qw( cmpthese );

cmpthese -30, {
'index' => \&index_find,
'regex' => \&regex_find,
};

__END__

C:\Temp> v
          Rate regex index
regex 194450/s    --   -4%
index 202488/s    4%    --
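
Before reading much into a 4% difference, it may be worth confirming that the two routines report the same offsets. The following is only a sketch: it uses argument-taking versions of the subs (with the length cached, as suggested above), the test strings are made up, and it assumes a non-empty search string.

```perl
use strict;
use warnings;

# Argument-taking versions of the two routines, with length cached.
# Assumption: $ss is non-empty (an empty $ss would never advance).
sub index_find {
    my ($text, $ss) = @_;
    my $length = length $ss;
    my @finds;
    my $found = index $text, $ss;
    while ($found >= 0) {
        push @finds, $found;
        $found = index $text, $ss, $found + $length;
    }
    return \@finds;
}

sub regex_find {
    my ($text, $ss) = @_;
    my $length = length $ss;
    my @finds;
    while ($text =~ /\Q$ss\E/g) {
        push @finds, pos($text) - $length;
    }
    return \@finds;
}

# Made-up cases: repeated match, adjacent matches, and no match at all.
for my $case (["abcabcabc", "abc"], ["aaaa", "aa"], ["no match here", "xyz"]) {
    my ($text, $ss) = @$case;
    my $by_index = join ",", @{ index_find($text, $ss) };
    my $by_regex = join ",", @{ regex_find($text, $ss) };
    my $verdict  = $by_index eq $by_regex ? "same" : "DIFFER";
    print "'$ss' in '$text': $verdict [$by_index]\n";
}
```

Both routines skip past each match before searching again, so overlapping occurrences (the second "aa" in "aaaa" starting at offset 1) are not reported by either.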

--
A. Sinan Unur <1usa@xxxxxxxxxxxxxxxxxxx>
(remove .invalid and reverse each component for email address)

comp.lang.perl.misc guidelines on the WWW:
http://www.rehabitation.com/clpmisc/