Re: File size too big for perl processing
- From: Sherman Pendley <spamtrap@xxxxxxxxxxx>
- Date: Mon, 30 Jun 2008 14:47:09 -0400
Cheez <danieldharkness@xxxxxxxxx> writes:
print "**fisher**";
$flatfile = "newrawdata.txt";
# 95MB in size
$datafile = "hashsequence16.txt";
# 203MB in size
my $filesize = -s "hashsequence16.txt";
# for use in processing time calculation
open(FILE, "$flatfile") || die "Can't open '$flatfile': $!\n";
open(FILE2, "$datafile") || die "Can't open '$datafile': $!\n";
open (SEQFILE, ">fishersearch.txt") || die "Can't open 'fishersearch.txt': $!\n";
@preparse = <FILE>;
@hashdata = <FILE2>;
close(FILE);
close(FILE2);
for my $list1 (@hashdata) {
If you're looping through $datafile one line at a time, there's no
need to read the whole thing into RAM at once. Just leave the file
open, and use a while() loop to read one line at a time instead:
while (my $list1 = <FILE2>) {
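A self-contained sketch of that line-at-a-time pattern follows. The temporary file and its three sample records are made up for illustration and only stand in for hashsequence16.txt; the tab-separated word/frequency layout is taken from the split() call in the original script. The lexical filehandle and three-argument open are a stylistic choice here, not something the original code requires.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical stand-in for hashsequence16.txt: one "word<TAB>frequency" per line.
my ($tmp_fh, $datafile) = tempfile(UNLINK => 1);
print {$tmp_fh} "ACGT\t3\n", "TTGA\t1\n", "GGCC\t7\n";
close($tmp_fh) or die "Can't close '$datafile': $!\n";

# Open for reading; the file stays open while we loop over it.
open(my $data_fh, '<', $datafile) or die "Can't open '$datafile': $!\n";

my $lines_read = 0;
my $total_freq = 0;

# Each pass reads exactly one line, so memory use does not grow with file size.
while (my $list1 = <$data_fh>) {
    chomp($list1);
    my ($line, $freq) = split(/\t/, $list1);
    $total_freq += $freq;
    $lines_read++;
}
close($data_fh);

print "read $lines_read lines, total frequency $total_freq\n";
```

Only the file that drives the outer loop is streamed this way; a file that is scanned repeatedly in an inner loop still has to be held in memory or reopened on each pass.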
# iterating through hash16 data
$finish++;
if ($finish ==10 ) {
# line counter
$marker = $marker + $finish;
$finish =0;
$left = $filesize - $marker;
printf "$left\/$filesize\n";
# this prints every 17 seconds
}
($line, $freq) = split(/\t/, $list1);
for my $rawdata (@preparse) {
# iterating through rawdata
$rawdata=~ s/\n//;
Using chomp() is a faster way to remove the trailing newline:
chomp($rawdata);
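As a small illustration of how chomp() behaves, here is a sketch with made-up sample strings. It also shows that chomp() accepts a whole array, which would allow the raw data to be stripped once after reading instead of on every pass of the inner loop; that rearrangement is a suggestion, not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical sample records; the last one has no trailing newline.
my @preparse = ("GATTACA\n", "CCGGTTAA\n", "TTTT");

# chomp() on a list strips the trailing newline from every element and
# returns the total number of characters it removed.
my $removed = chomp(@preparse);

# A second call is harmless: there is nothing left to strip.
my $removed_again = chomp(@preparse);

print join(",", @preparse), "\n";
print "removed $removed, then $removed_again\n";
```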
if ($rawdata =~ m/$line/) {
# matching hash16 word with rawdata line
my $first_pos = index $rawdata,$line;
Calling index() scans the string a second time. There's no need to do
that, since the start offset of the last successful match is already
stored in @-:
my $first_pos = $-[0];
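A minimal sketch of reading the match position from @- follows, with made-up sample strings. It assumes $line is meant to be matched as literal text, so the pattern is wrapped in \Q...\E; the original script interpolates $line unquoted, in which case regex metacharacters in the data would change what matches and the result could differ from index().

```perl
use strict;
use warnings;

# Hypothetical sample data.
my $rawdata = "xxACGTyyACGT";
my $line    = "ACGT";

my ($first_pos, $end_pos);
if ($rawdata =~ m/\Q$line\E/) {
    $first_pos = $-[0];   # offset where the whole match starts
    $end_pos   = $+[0];   # offset just past the end of the match
}

# For a literal pattern, index() finds the same starting offset,
# but only by scanning the string again.
my $via_index = index($rawdata, $line);

print "match at $first_pos..$end_pos, index() says $via_index\n";
```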
sherm--
--
My blog: http://shermspace.blogspot.com
Cocoa programming in Perl: http://camelbones.sourceforge.net