Re: why a.pl is faster than b.pl



Hi, Bob,

You said:

3. It will probably be faster to use a single regex of the format:

/pata|patb|patc|patd/


In fact, you may be wrong about this. In my test case, the regex written as separate matches, like this:

/pata/ || /patb/ || /patc/ || /patd/

is much faster than yours.
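
A comparison like this can be reproduced with the core Benchmark module. The script below is only a minimal sketch: the sample lines and the pattern names pata..patd are hypothetical stand-ins, not the real log data. One plausible explanation, as far as I understand it, is that a plain literal pattern can be located with a fast fixed-string search, while an alternation may have to try each branch at each position (newer perls optimize literal alternations, so results can differ by version).

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical sample data: most lines match nothing, one line matches.
my @lines = map { "line $_: nothing interesting here, just filler text" } 1 .. 1000;
push @lines, "line x: contains patc somewhere";

# Four separate matches joined by || at the Perl level.
sub count_separate {
    my $n = 0;
    for (@lines) {
        $n++ if /pata/ || /patb/ || /patc/ || /patd/;
    }
    return $n;
}

# One regex with the alternation inside it.
sub count_alternation {
    my $n = 0;
    for (@lines) {
        $n++ if /pata|patb|patc|patd/;
    }
    return $n;
}

# Both forms must select the same lines before timing means anything.
die "results differ\n" unless count_separate() == count_alternation();

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    separate    => \&count_separate,
    alternation => \&count_alternation,
});
```

The numbers depend on the perl version, the patterns, and how often lines match, so it is worth running this against real data before drawing a conclusion.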


-----Original Message-----
>From: Bob Showalter <bob_showalter@xxxxxxxxxxxxxxx>
>Sent: Dec 29, 2005 2:54 AM
>To: Jeff Pang <pangj@xxxxxxxxxxxxx>
>Cc: beginners@xxxxxxxx
>Subject: Re: why a.pl is faster than b.pl
>
>Jeff Pang wrote:
>> Hi, list,
>>
>> I have two Perl scripts, as follows:
>>
>> a.pl:
>> ----
>> #!/usr/bin/perl
>> use strict;
>>
>> my @logs = glob "~/logs/rcptstat/da2005_12_28/da.127.0.0.1.*";
>>
>> foreach my $log (@logs) {
>> open (HD,$log) or die "$!";
>> while(<HD>){
>>
>> if (
>> ($_ =~ /?¢²á/o) ||
>> ($_ =~ /?÷??/o) ||
>> ($_ =~ /?¥µ®¿ì??/) ||
>> ($_ =~ /?¦?¸/o) ||
>> ($_ =~ /?ø?¨/o) ||
>> ($_ =~ /·¢»õ/o) ||
>> ($_ =~ /±±¾©/o) ||
>> ($_ =~ /????/o) ||
>> ($_ =~ /???¢/o) ||
>> ($_ =~ /?ã?½/o) ||
>> ($_ =~ /°??ò/o) ||
>> ($_ =~ /?â·?/o) ) {
>> print $_;
>> }
>> }
>> close HD;
>> }
>>
>>
>> b.pl
>> ----
>> #!/usr/bin/perl
>> use strict;
>>
>> my $ref = sub { $_[0] =~ /?¢²á/o || $_[0] =~ /?÷??/o || $_[0] =~ /?¥µ®¿ì??/o ||
>> $_[0] =~ /?¦?¸/o || $_[0] =~ /?ø?¨/o || $_[0] =~ /·¢»õ/o ||
>> $_[0] =~ /±±¾©/o || $_[0] =~ /????/o || $_[0] =~ /???¢/o ||
>> $_[0] =~ /?ã?½/o || $_[0] =~ /°??ò/o || $_[0] =~ /?â·?/o };
>>
>>
>> my @logs = glob "~/logs/rcptstat/da2005_12_28/da.127.0.0.1.*";
>>
>> foreach my $log (@logs) {
>> open (HD,$log) or die "$!";
>> while(<HD>){
>> print if $ref->($_);
>> }
>> close HD;
>> }
>>
>>
>> I run the 'time' command to get the running speed:
>>
>> time perl a.pl > /dev/null
>>
>> real 0m0.190s
>> user 0m0.181s
>> sys 0m0.008s
>>
>>
>> time perl b.pl > /dev/null
>>
>> real 0m0.286s
>> user 0m0.278s
>> sys 0m0.007s
>>
>>
>> Why is a.pl faster than b.pl? I expected the result to be the opposite. Thanks.
>>
>
>Well, the time differences aren't dramatic. But offhand, I would say
>that a.pl is faster because no subroutine call is involved.
>
>A couple of other observations:
>
>1. /o is useless on these regexes, since they don't interpolate any
>variables.
>
>2. $_ is the default target for the m// operator, so
>
> $_ =~ /regex/
>
>can be replaced with simply
>
> /regex/
>
>3. It will probably be faster to use a single regex of the format:
>
> /pata|patb|patc|patd/
>
>If the alternation can stay inside the regex code rather than happening
> out at the Perl opcode level, it might be faster.
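
Bob's explanation for the original question (a.pl avoids a subroutine call on every line) can also be checked with Benchmark. This is a rough sketch under assumptions: the lines and the patterns pata and patb are made-up placeholders for the real log data, and the inline version also applies his points 1 and 2 (no /o, implicit $_).

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical stand-ins for the log lines read by a.pl and b.pl.
my @lines = map { "log entry $_ with no keyword in it" } 1 .. 1000;
push @lines, "log entry with patb in it";

# b.pl style: every line goes through a code reference.
my $ref = sub { $_[0] =~ /pata/ || $_[0] =~ /patb/ };

# a.pl style: the matches run inline against $_, without /o.
sub count_inline {
    my $n = 0;
    for (@lines) { $n++ if /pata/ || /patb/ }
    return $n;
}

sub count_via_sub {
    my $n = 0;
    for (@lines) { $n++ if $ref->($_) }
    return $n;
}

# Sanity check: both styles must agree on which lines match.
die "results differ\n" unless count_inline() == count_via_sub();

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    inline  => \&count_inline,
    via_sub => \&count_via_sub,
});
```

If the via_sub row is consistently slower, the call overhead is a likely cause of the gap between a.pl and b.pl, though file I/O in the real scripts may hide part of it.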
