Re: delimited data into nested array
From: LaDainian Tomlinson (go_at_away.spam.invalid)
Date: 08/05/04
- Next message: Anno Siegel: "Re: recursive functions"
- Previous message: Anno Siegel: "Re: if ( $a eq 1 || $b eq 2...)"
- In reply to: Yup: "delimited data into nested array"
- Next in thread: Ben Morrow: "Re: delimited data into nested array"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Date: Thu, 05 Aug 2004 08:00:42 GMT
On 2004-08-04, 'Yup' <yaweh32@yahoo.com> wrote in comp.lang.perl.misc:
> Hello,
> I'm just starting to learn Perl. Up until now I'd been struggling to
> learn how to import a tab-delimited data table (from a text file) into
> Perl as a two dimensional nested array. I wanted to be able to
> manipulate that data by accessing it using x-y coordinates, and then
> output it again.
>
> I may have been looking in the wrong place, but I had a hard time
> finding help on Google groups and other places online. However, I've
> managed to figure it out, and thought I'd post my code.
>
> For two reasons, I suppose - to help other novices, and perhaps get
> some comments from more advanced users. On the latter part, I'm
> interested in knowing what I could do to make my code run faster and
> make it more Perl-ish.
There is lots to correct, I'm afraid. No offense, but I hope that the
other novices will continue reading the thread rather than stopping at
this solution (or mine, for that matter).
> #!/usr/local/bin/perl -w
Good, but not good enough. Get rid of the -w and add:
use warnings;
use strict;
Then you'll have to put a 'my' in front of all the variables you
declare. It'll break some other things too, but I promise it's for the
best.
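A minimal sketch of what strict buys you, assuming a deliberate typo ($fiel, a made-up name for illustration). The string eval is only there so the script can show the compile error instead of dying from it:

```perl
use strict;
use warnings;

my $file = 'a1.txt';    # declared with 'my', so strict is satisfied

# Compile a snippet containing a typo ($fiel instead of $file).
# Under strict the snippet fails to compile and eval returns undef.
my $ok = eval q{
    use strict;
    my $name = 'a1.txt';
    my $copy = $fiel;    # hypothetical typo, never declared
    1;
};
my $err = $ok ? '' : $@;
print "strict caught it: $err" unless $ok;
```

Without strict, the typo would silently create a new, empty global variable.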
> #define filename and open it
> $file = 'a1.txt';
> open(INFO, $file) || die "Can't open file $file\n" ;
Let perl tell you why it died by including $! in the output:
open my $INFO, '<', $file or die "Couldn't open $file: $!";
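To see what $! adds, here is a small sketch that tries to open a file assumed not to exist (the file name is hypothetical) and reports the reason instead of dying:

```perl
use strict;
use warnings;

my $file = 'no_such_file_for_demo.txt';    # hypothetical, assumed missing

my $msg;
if ( open my $INFO, '<', $file ) {
    $msg = "Opened $file";
    close $INFO;
}
else {
    # $! holds the operating system's reason for the failure
    $msg = "Couldn't open $file: $!";
}
print "$msg\n";
```

The exact wording after the colon comes from the operating system, so it varies between platforms and locales.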
> # read file into temporary array called "lines"
> while(<INFO>)
> {
> #chop off the carriage return
> chop $_;
> push @lines, $_;
> }
Bleh. Do you have a good reason for reading the file into an array
rather than performing the data manipulation and immediately writing to
the file? If you're reading really huge files into arrays, you're going
to run into memory problems. Use the following construct if you can:
my $outfile = 'some_other.txt';
open my $OUTFILE, '>', $outfile or die "Couldn't open $outfile: $!";
# output line terminator
$\ = "\n";
while ( <$INFO> ){
    chomp; # chomp is safer than chop
    my @fields = split /\t/;
    my @new_fields = do_stuff_with( @fields );
    print $OUTFILE join "\t" => @new_fields;
}
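A runnable sketch of that line-at-a-time construct. The assumptions here: do_stuff_with is a stand-in that just upper-cases each field, and in-memory filehandles (open on a scalar reference) replace real files so the example is self-contained:

```perl
use strict;
use warnings;

# Hypothetical transformation: upper-case every field.
sub do_stuff_with { return map { uc } @_ }

my $input  = "a\tb\tc\nd\te\tf\n";
my $output = '';
open my $IN,  '<', \$input  or die "Couldn't open input: $!";
open my $OUT, '>', \$output or die "Couldn't open output: $!";

{
    local $\ = "\n";    # output line terminator, limited to this block
    while ( <$IN> ) {
        chomp;
        my @fields = split /\t/;
        print {$OUT} join( "\t", do_stuff_with(@fields) );
    }
}
close $OUT or die "Couldn't close output: $!";
print $output;
```

Only one line is held in memory at a time, which is the point of writing as you read.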
If you'd still like to use the arrays, read up on perllol and use this:
my @AoA; # an array of arrays
while ( <$INFO> ){
    chomp;
    my @fields = split /\t/;
    # push references to arrays, not arrays themselves
    push @AoA => \@fields;
}
my @new_AoA = do_stuff_with( @AoA );
# open, etc.
print $OUTFILE join "\t" => @$_ for @new_AoA;
That will basically do everything you wanted, but I'll put some comments
below as well.
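For the x-y access the original question asked about, a minimal sketch under a few assumptions: the sample data is made up, an in-memory filehandle stands in for a1.txt, and split gets a limit of -1 so trailing empty fields are kept and every row has the same width:

```perl
use strict;
use warnings;

my $input = "x\tc1\tc2\nr1\t1\t2\nr2\t3\t4\n";    # made-up sample table
open my $INFO, '<', \$input or die "Couldn't open input: $!";

my @AoA;    # an array of array references
while ( <$INFO> ) {
    chomp;
    # the anonymous array [ ... ] is a fresh reference for each row
    push @AoA, [ split /\t/, $_, -1 ];
}

$AoA[0][0] = 'START';     # row 0, column 0
my $cell = $AoA[2][1];    # row 2, column 1

# rebuild the tab-delimited text, one row per line
my $result = join '', map { join( "\t", @$_ ) . "\n" } @AoA;
print $result;
```

The first index picks the row and the second the column, so $AoA[$y][$x] is the cell at column $x of row $y.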
> # Close the file
> close(INFO);
If you use lexical filehandles like above, you don't need to close them
explicitly. They are closed when they go out of scope (in this case,
when you exit the program; in others, when you exit a subroutine or
other block).
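A small sketch of that scoping behaviour, assuming a temporary file from the core File::Temp module. The handle is never closed explicitly, yet the data is on disk once the block ends:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file and close the handle tempfile gives us,
# so the only open handle is the lexical one in the block below.
my ( $tmp_fh, $path ) = tempfile( UNLINK => 1 );
close $tmp_fh or die "Couldn't close $path: $!";

{
    open my $OUT, '>', $path or die "Couldn't open $path: $!";
    print {$OUT} "buffered line\n";
}    # $OUT goes out of scope here; perl flushes and closes it

my $size = -s $path;
print "bytes written: $size\n";
```

For files you write to, an explicit "close $OUT or die ..." is still worth considering, since it is the only place a failed flush (a full disk, say) gets reported.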
> #reset index for generating arrays named "line_{$i}"
> $i = 0;
Please do not build variable names out of other variables like this
(these are symbolic references). You're already familiar with arrays,
and you've probably seen hashes (associative arrays) as well. Use them
whenever you're tempted to try this. The 'use strict' above will turn
this mistake into an error. See:
perldoc -q "How can I use a variable as a variable name?"
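As a sketch of the alternative, here the would-be variable names (line_0, line_1, ...) become keys of a single hash. The sample data and the %line name are made up for illustration:

```perl
use strict;
use warnings;

my @lines = ( "a\tb", "c\td" );    # made-up sample lines

# Instead of @line_0, @line_1, ... keep one hash keyed by the would-be name.
my %line;
for my $i ( 0 .. $#lines ) {
    $line{"line_$i"} = [ split /\t/, $lines[$i] ];
}

print $line{line_1}[0], "\n";    # first field of the second line
```

When the keys are just consecutive numbers, as here, a plain array of array references does the same job with less typing.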
> #read each array entry into new array, split with tab
> foreach (@lines)
> {
> push @{'line_'.${i}}, split("\t", $lines[$i]);
> $i++
> }
>
> #generate an array to hold the other arrays
> for ($i=0; $i<scalar(@lines); $i++)
> {
> push @A, *{'line_'.${i}};
> }
You won't very often need C-style for loops like this in Perl. In this
case, you could use the range operator if you wanted. It's usually
easier just to loop over the list itself:
# range operator
for ( 0..$#lines ){ ... }
# aliasing
for my $line ( @lines ){ ... }
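A short sketch of both styles on made-up data. One thing worth knowing about the aliasing form is that the loop variable is the element itself, not a copy, so changing it changes the array:

```perl
use strict;
use warnings;

my @lines = ( "a\n", "b\n" );    # made-up sample data

# aliasing: $line is each element in turn, so chomp edits @lines in place
for my $line (@lines) {
    chomp $line;
}

# range operator: handy when the index itself is needed
my @numbered;
for my $i ( 0 .. $#lines ) {
    push @numbered, "$i: $lines[$i]";
}
print "$_\n" for @numbered;
```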
> #rename the top corner to be START
> $A[0][0] = "START";
>
> #open file to send data to
> open(OUTFILE, ">a1_edited.txt");
Always check the return value of open. Always. Yes, always.
open( ... ) or die "Oh well: $!";
> for ($i=0; $i<scalar(@lines);$i++)
> {
>
> for ($j=0;$j<scalar(@line_0);$j++)
> {
> print OUTFILE "$A[$i][$j]";
> if ($j<(scalar(@line_0)-1)) { print OUTFILE "\t";}
> }
>
> print OUTFILE "\n";
> }
> close(OUTFILE);
This whole loop was condensed above by using join(), $\, and that handy
array of arrays (array references, really).
Hopefully this will be useful to you. Perl is pretty complicated and
features a lot of little tricks to make your code cleaner. It takes a
long time to learn them all (I'm approximately 3% of the way there).
Good luck,
Brandan L.
-- bclennox \at eos \dot ncsu \dot edu