Re: delimited data into nested array
From: Ben Morrow (usenet_at_morrow.me.uk)
Date: 08/04/04
- Next message: Ben Morrow: "Re: @platforms = (sort keys %prj_platforms) || (DEFAULT);"
- Previous message: Brian McCauley: "Re: if ( $a eq 1 || $b eq 2...)"
- In reply to: Yup: "delimited data into nested array"
- Next in thread: Tassilo v. Parseval: "Re: delimited data into nested array"
- Reply: Tassilo v. Parseval: "Re: delimited data into nested array"
- Reply: Anno Siegel: "Re: delimited data into nested array"
Date: Wed, 4 Aug 2004 19:54:42 +0100
Quoth yaweh32@yahoo.com (Yup):
> Hello,
> I'm just starting to learn Perl. Up until now I'd been struggling to
> learn how to import a tab-delimited data table (from a text file) into
> Perl as a two dimensional nested array. I wanted to be able to
> manipulate that data by accessing it using x-y coordinates, and then
> output it again.
>
> I may have been looking in the wrong place, but I had a hard time
> finding help on Google groups and other places online. However, I've
> managed to figure it out, and thought I'd post my code.
>
> For two reasons, I suppose - to help other novices, and perhaps get
> some comments from more advanced users. On the latter part, I'm
> interested in knowing what I could do to make my code run faster and
> make it more Perl-ish.
>
> Thanks for your help ahead of time.
>
> #!/usr/local/bin/perl -w
use strict;
use warnings;
warnings is a more modern replacement for -w: see perldoc warnings.
strict helps catch common coding errors.
> #define filename and open it
> $file = 'a1.txt';
> open(INFO, $file) || die "Can't open file $file\n" ;
open my $INFO, '<', $file or die "Can't open $file: $!";
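The lexical FH ('my $INFO' vs simply 'INFO') means it will close when it
goes out of scope. The explicit '<' protects against nasty filenames (a
name beginning with '>' or ending in '|' would otherwise be taken as a
mode or a command). Using 'or' instead of '||' removes the need for
brackets.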
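A minimal sketch of that open idiom wrapped in a sub, so the scope-based close is visible. The name read_lines and the temporary file (standing in for a1.txt) are made up for illustration:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: return every line of a file, line endings removed.
sub read_lines {
    my ($file) = @_;
    # Three-argument open: the mode is a separate argument, so the
    # filename is never parsed for '>' or '|' characters.
    # 'or' binds more loosely than the argument list, so no brackets
    # are needed around the arguments to open.
    open my $in, '<', $file or die "Can't open $file: $!";
    my @lines = <$in>;
    chomp @lines;
    return @lines;    # $in goes out of scope on return and is closed
}

# Demo data in a temporary file (an assumption; any tab-delimited file works).
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print {$tmp_fh} "a\tb\n", "c\td\n";
close $tmp_fh or die "Can't close $tmp_name: $!";

my @lines = read_lines($tmp_name);
print scalar(@lines), " lines read\n";
```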
> # read file into temporary array called "lines"
my @lines;
as you are now using strictures.
> while(<INFO>)
> {
> #chop off the carrage return
> chop $_;
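Use chomp instead of chop to remove newlines: chop removes the last
character whatever it is, so a final line with no trailing newline loses
real data, whereas chomp only removes the line ending (the value of $/)
if it is actually there.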
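A small illustration of the difference, using made-up strings rather than lines from your file:

```perl
use strict;
use warnings;

my $with_newline    = "last field\n";
my $without_newline = "last field";    # e.g. a final line with no newline

# chop always removes exactly one character from the end.
chop(my $chopped_a = $with_newline);
chop(my $chopped_b = $without_newline);

# chomp removes a trailing line ending only if one is present.
chomp(my $chomped_a = $with_newline);
chomp(my $chomped_b = $without_newline);

print "chop:  '$chopped_a' '$chopped_b'\n";
print "chomp: '$chomped_a' '$chomped_b'\n";
```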
> push @lines, $_;
> }
>
> # Close the file
> close(INFO);
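This whole section can be compressed into
my @lines = do {
    open my $INFO, '<', $file or die "...";
    <$INFO>;
};
chomp @lines;
<$INFO> in list context will return all the lines of the file. chomp
needs a variable to modify and returns a count rather than the lines, so
it is applied to @lines after the assignment.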
> #reset index for generating arrays named "line_{$i}"
> $i = 0;
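my $i;
There is no need to initialize to 0 if all you do is increment it
(though using an undefined $i as an array index does warn under 'use
warnings'). As it turns out, you don't need $i at all.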
> #read each array entry into new array, split with tab
> foreach (@lines)
> {
More usual style would be
for (@lines) {
> push @{'line_'.${i}}, split("\t", $lines[$i]);
Ohmygoodnessme. These are symbolic references ('symrefs'), and are a
*very* bad idea. They are such a bad idea that 'use strict' will prevent
you from using them. What you want here is an array of array references;
and there's no need to keep count of your array indices, as the 'for'
already puts each element of @lines in $_:
my @A;
# A better name than @A is probably appropriate...
for (@lines) {
push @A, [ split "\t" ];
}
or, more Perlishly, you could use map:
my @A = map { [ split /\t/ ] } @lines;
The [...] construct is an array ref constructor: see perldoc perlreftut.
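For what it's worth, here is a sketch of the resulting structure and the x-y access you were after. The sample rows are invented, and the -1 limit passed to split is my addition: without it split drops trailing empty fields, so a row ending in a tab would come out one column short.

```perl
use strict;
use warnings;

# Invented sample data; the last row ends with an empty field.
my @lines = ("id\tname\tscore", "1\talice\t90", "2\tbob\t");

# Each [ ... ] creates a reference to a new anonymous array, so @A is
# an array of array references (a "two-dimensional" array).
my @A = map { [ split /\t/, $_, -1 ] } @lines;

$A[0][0] = 'START';                  # row 0, column 0
my $cell = $A[1][1];                 # row 1, column 1
my $cols_in_last_row = @{ $A[2] };   # number of fields in row 2

print "cell: $cell, columns in last row: $cols_in_last_row\n";
```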
> $i++
> }
>
> #generate an array to hold the other arrays
> for ($i=0; $i<scalar(@lines); $i++)
Avoid C-style loops where you can loop over the list (or its indices)
directly. It's not good Perl style.
There's no need for that explicit 'scalar': $i < @lines will work
perfectly well.
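To illustrate both points with a throwaway array: an array in scalar (or numeric) context yields its element count, and when you really do need the index, a range up to $#lines avoids the C-style loop.

```perl
use strict;
use warnings;

my @lines = ('a', 'b', 'c');

# An array evaluated in scalar context gives the number of elements.
my $count = @lines;

# $#lines is the index of the last element, so this visits every index.
my @seen;
for my $i (0 .. $#lines) {
    push @seen, "$i:$lines[$i]";
}

print "$count elements: @seen\n";
```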
> {
> push @A, *{'line_'.${i}};
I'm slightly amazed this even works... I would have expected you to
need to say *{...}{ARRAY}... anyway, it's YUCK. You shouldn't be
messing with globs (the '*' things) unless you *really* know what you're
doing.
You've already done this, now, anyway (one advantage of doing things
right in the first place :)...
> }
>
> #rename the top corner to be START
> $A[0][0] = "START";
>
> #open file to send data to
> open(OUTFILE, ">a1_edited.txt");
Output can fail too:
open my $OUTFILE, '>', 'a1_edited.txt'
    or die "can't create a1_edited.txt: $!";
> for ($i=0; $i<scalar(@lines);$i++)
> {
>
> for ($j=0;$j<scalar(@line_0);$j++)
> {
> print OUTFILE "$A[$i][$j]";
Don't quote things when you don't need to.
> if ($j<(scalar(@line_0)-1)) { print OUTFILE "\t";}
> }
>
> print OUTFILE "\n";
> }
for (@A) {
print $OUTFILE join( "\t" => @$_ ), "\n";
}
or even
print $OUTFILE join( "\n", map { join "\t" => @$_ } @A ), "\n";
or, using Perl's special output variables (this is how I'd do it):
{
open my $OUTFILE, '>', '...' or die "...";
local ($,, $\) = ("\t", "\n");
print $OUTFILE @$_ for @A;
}
The 'local' keeps the changes to $, and $\ to within the braces.
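A sketch of the same idea that can be run without touching the disk: it prints to an in-memory filehandle (my substitution for the real output file) and then checks that $, and $\ are back to their undefined defaults once the block ends.

```perl
use strict;
use warnings;

my @A = ( [ 'START', 'x' ], [ 1, 2 ] );    # invented sample table
my $buffer = '';

{
    # Opening on a scalar reference writes into $buffer instead of a file.
    open my $out, '>', \$buffer or die "Can't open in-memory file: $!";
    # $, is printed between the items of each print,
    # $\ is printed after each print.
    local ($,, $\) = ("\t", "\n");
    print {$out} @$_ for @A;
    close $out or die "Can't close in-memory file: $!";
}

# Outside the block both variables have their previous (undefined) values.
my $restored = ( !defined($,) && !defined($\) ) ? 1 : 0;

print $buffer;
```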
> close(OUTFILE);
Again, I would use a scope (set of braces) to close the file unless I
wanted to check for an error on close, which might be a good idea...
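If you do want to check, something along these lines would do. The sub name write_table and the temporary file are hypothetical; the point is only that print and close both return false on failure, and that buffered write errors may not show up until the close.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: write rows (array refs) as tab-delimited lines.
sub write_table {
    my ($file, @rows) = @_;
    open my $out, '>', $file or die "Can't create $file: $!";
    for my $row (@rows) {
        print {$out} join("\t", @$row), "\n"
            or die "Can't write to $file: $!";
    }
    # An explicit close lets us see errors an implicit close would hide.
    close $out or die "Can't close $file: $!";
    return;
}

# Demo against a temporary file, then read it back in one go.
my (undef, $name) = tempfile(UNLINK => 1);
write_table($name, [ 'START', 'x' ], [ 1, 2 ]);

open my $in, '<', $name or die "Can't open $name: $!";
my $written = do { local $/; <$in> };
close $in or die "Can't close $name: $!";

print $written;
```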
Ben
-- 
Musica Dei donum optimi, trahit homines, trahit deos.
Musica truces molit animos, tristesque mentes erigit.
Musica vel ipsas arbores et horridas movet feras.
ben@morrow.me.uk