Re: Why is my regex so slow?
- From: jialinli1981@xxxxxxxxx (Jialin Li)
- Date: Fri, 31 Oct 2008 16:34:22 -0500
Try this:
print "Match\n" if($target =~ /^$regex$/o);
The /o modifier makes Perl compile the regex only once.
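One caveat with /o: the pattern is interpolated and compiled only the first time the match runs, so if a second pattern is read from standard input later, the first one is still used. A minimal sketch of an alternative, assuming it is acceptable to compile each pattern once with qr// before the inner loop; the helper names compile_anchored and matching_titles are made up for illustration and are not part of the original script. Whether this removes the slowdown described below is not verified here and would need to be measured.

```perl
use strict;
use warnings;

# Hypothetical helper: compile the user-supplied pattern once, anchored
# at both ends, and return the compiled regex object.
sub compile_anchored {
    my ($pattern) = @_;
    return qr/^$pattern$/;
}

# Hypothetical helper: return the titles that match the compiled regex,
# after the same underscore-to-space substitution used in the script.
sub matching_titles {
    my ($re, @titles) = @_;
    my @hits;
    for my $title (@titles) {
        (my $target = $title) =~ s/_/ /g;
        push @hits, $target if $target =~ $re;
    }
    return @hits;
}

# Made-up sample titles standing in for lines of the title file.
my $re   = compile_anchored('.*target.*');
my @hits = matching_titles($re, 'a_target_here', 'nothing', 'target');
print "Match: $_\n" for @hits;
```

Because qr// is evaluated each time compile_anchored is called, a new pattern read on the next pass of the outer loop gets its own compiled regex, which /o would not allow.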
On Fri, Oct 31, 2008 at 2:54 PM, Mark Wagner <carnildo@xxxxxxxxx> wrote:
I've got a script I'm using to search through a list of Wikipedia
article titles to find ones that match certain patterns.
As written, if you run it and supply '.*target.*' on standard input,
it will process my test file in 125 seconds. Make any of the changes
mentioned in the comments, and the time needed will drop to 1.8
seconds. Why the difference? Particularly interesting is that it
seems to matter where the regex pattern came from: if it's from
standard input, testing is slow; if it's assigned in the script,
testing is fast.
If it matters, I'm using Perl 5.8.8.
To see the problem I'm having, download
http://download.wikimedia.org/eswiki/20081018/eswiki-20081018-all-titles-in-ns0.gz
(a 4.1-MB file), unzip it, and run the program supplying the name of
the unzipped file.
Thanks,
Mark Wagner
--------------
binmode STDIN, ":utf8"; # Comment this out to speed things up
while(<STDIN>)
{
    my $lines = 0;
    my $lines2 = 0;
    my $regex;
    $regex = $_;
    chomp $regex;
    #$regex = '.*target.*'; # Or uncomment this to speed things up
    open INFILE, "<", $ARGV[0];
    binmode INFILE, ":utf8"; # Or comment this out to speed things up
    while(<INFILE>)
    {
        my $target = $_;
        chomp $target;
        $target =~ s/_/ /g;
        # Or make this case-insensitive to speed things up, or remove the
        # start and end anchors to speed things up
        print "Match\n" if($target =~ /^$regex$/);
        $lines = $lines + 1;
        if($lines >= 10000)
        {
            $lines = 0;
            $lines2 += 10000;
            print STDERR "$lines2\r";
        }
    }
}
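Since every change that speeds the script up touches either the :utf8 layers or the shape of the pattern, one way to narrow things down is to time the same pattern text with and without Perl's internal UTF-8 flag. The following is only a diagnostic sketch under stated assumptions: the helper time_matches and the sample titles are made up, and utf8::upgrade is used here to imitate what reading the pattern through a :utf8 layer is assumed to do. It does not claim to identify the cause.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical helper: test every title against the anchored pattern and
# return (number of matches, elapsed seconds).
sub time_matches {
    my ($pattern, $titles) = @_;
    my $re      = qr/^$pattern$/;
    my $start   = time;
    my $matches = grep { $_ =~ $re } @$titles;   # grep in scalar context counts
    return ($matches, time - $start);
}

# Made-up sample data standing in for the title file.
my @titles = map { "title $_" } 1 .. 1000;
push @titles, 'a target here';

# The same pattern text twice: once as a plain byte string, once with the
# internal UTF-8 flag switched on (assumption: this mimics a pattern read
# through a :utf8 layer).
my $plain   = '.*target.*';
my $flagged = $plain;
utf8::upgrade($flagged);

for my $case ([plain => $plain], [flagged => $flagged]) {
    my ($label, $pattern) = @$case;
    my ($matches, $elapsed) = time_matches($pattern, \@titles);
    printf "%-8s utf8 flag=%d matches=%d time=%.4fs\n",
        $label, (utf8::is_utf8($pattern) ? 1 : 0), $matches, $elapsed;
}
```

To reproduce the numbers in the post, the sample list would be replaced with the lines of the real title file, read both with and without binmode INFILE, ":utf8".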