Re: Handling Large files (a few Gb) in Perl
- From: Paul Lalli <mritty@xxxxxxxxx>
- Date: Mon, 16 Jul 2007 05:15:31 -0700
On Jul 16, 7:40 am, "sydc...@xxxxxxxxx" <sydc...@xxxxxxxxx> wrote:
> I am a beginner (or worse) at Perl.
> I have a need to find the longest line (record) in a file. The below
> code works neatly for small files.
> But when I need to read huge files (in the order of Gb), it is very
> slow.
> Could someone help me in finding what way I could make Perl work the
> best way for processing huge files such as these?
> XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
> my $prev=-1;
> my $curr=0;
> my ($sec,$min,$hour,$com) = localtime(time);
> print "Start time - $hour:$min:$sec \n";
> open(F1, "c:\\perl\\syd\\del.txt");
> while (<F1>)
> {
> $curr = index($_, "\x0A");

Well, here's one improvement you could make: don't force Perl to
search through each string looking for a specific character. Just ask
it what the length of the string is. In my tests, that's about 10%
faster:
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw/:all/;
sub use_index {
    open my $fh, '<', 'ipsum.txt' or die $!;
    my $prev = 0;
    while (<$fh>) {
        my $cur = index($_, "\x0A");
        if ($cur > $prev) {
            $prev = $cur;
        }
    }
}
sub use_length {
    open my $fh, '<', 'ipsum.txt' or die $!;
    my $prev = 0;
    while (<$fh>) {
        my $cur = length;
        if ($cur > $prev) {
            $prev = $cur;
        }
    }
}
cmpthese(timethese(100_000, { length => \&use_length, index => \&use_index }));
__END__
Benchmark: timing 100000 iterations of index, length...
     index: 26 wallclock secs (19.81 usr +  6.27 sys = 26.08 CPU) @ 3834.36/s (n=100000)
    length: 24 wallclock secs (17.10 usr +  6.47 sys = 23.57 CPU) @ 4242.68/s (n=100000)
         Rate  index length
index  3834/s     --   -10%
length 4243/s    11%     --
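
Applied to the original task, a minimal sketch might look like the following. The sub name longest_line and the decision to also report the line number are assumptions for illustration, not part of the original code. Note that length counts the trailing newline while index($_, "\x0A") does not (and index returns -1 for a final line with no newline), so the sketch subtracts one when a newline is present to keep the two results comparable.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (length, line number) of the longest
# line in $path. The length excludes the trailing newline, if any.
sub longest_line {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";

    my ($max, $max_line) = (0, 0);
    while (my $line = <$fh>) {
        my $len = length $line;
        # Do not count the line terminator itself.
        $len-- if substr($line, -1) eq "\n";
        if ($len > $max) {
            # $. holds the number of the line just read from $fh.
            ($max, $max_line) = ($len, $.);
        }
    }
    close $fh;
    return ($max, $max_line);
}
```

It could be called as my ($len, $line_no) = longest_line('del.txt'); and reads one line at a time, so memory use depends on the longest line rather than on the size of the file.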
Paul Lalli