Re: delimited data into nested array

From: Ben Morrow (usenet_at_morrow.me.uk)
Date: 08/04/04


Date: Wed, 4 Aug 2004 19:54:42 +0100


Quoth yaweh32@yahoo.com (Yup):
> Hello,
> I'm just starting to learn Perl. Up until now I'd been struggling to
> learn how to import a tab-delimited data table (from a text file) into
> Perl as a two dimensional nested array. I wanted to be able to
> manipulate that data by accessing it using x-y coordinates, and then
> output it again.
>
> I may have been looking in the wrong place, but I had a hard time
> finding help on Google groups and other places online. However, I've
> managed to figure it out, and thought I'd post my code.
>
> For two reasons, I suppose - to help other novices, and perhaps get
> some comments from more advanced users. On the latter part, I'm
> interested in knowing what I could do to make my code run faster and
> make it more Perl-ish.
>
> Thanks for your help ahead of time.
>
> #!/usr/local/bin/perl -w

use strict;
use warnings;

warnings is a more modern replacement for -w: see perldoc warnings.
strict helps catch common coding errors.

> #define filename and open it
> $file = 'a1.txt';
> open(INFO, $file) || die "Can't open file $file\n" ;

open my $INFO, '<', $file or die "Can't open $file: $!";

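The lexical FH ('my $INFO' vs simply 'INFO') means it will close when it
goes out of scope. The explicit '<' mode protects against nasty filenames
(ones beginning with '>' or ending in '|', say, which two-argument open
would interpret as a mode). Using 'or' instead of '||' removes the need
for brackets, and $! says why the open failed.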

> # read file into temporary array called "lines"

my @lines;

as you are now using strictures.

> while(<INFO>)
> {
> #chop off the carrage return
> chop $_;

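Use chomp instead of chop to remove newlines: chomp removes only a
trailing line ending (the value of $/), so a last line with no newline is
left intact, whereas chop removes the final character whatever it is.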
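
A small sketch of how the two differ. The sample strings are made up for
illustration; the second one has no trailing newline, as the last line of a
file often does not.

```perl
use strict;
use warnings;

# Hypothetical sample data, not taken from the original file.
my @chopped = ("alpha\n", "beta");
my @chomped = @chopped;

chop  @chopped;   # removes the last character of each element, whatever it is
chomp @chomped;   # removes a trailing "\n" (really $/) only if it is present

print join(",", @chopped), "\n";   # alpha,bet
print join(",", @chomped), "\n";   # alpha,beta
```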

> push @lines, $_;
> }
>
> # Close the file
> close(INFO);

This whole section can be compressed into

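my @lines = do {
    open my $INFO, '<', $file or die "...";
    <$INFO>;
};
chomp @lines;

<$INFO> in list context will return all the lines of the file. The chomp
has to be a separate statement, as it returns the number of characters
removed rather than the chomped lines.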

> #reset index for generating arrays named "line_{$i}"
> $i = 0;

my $i;

There is no need to initialize to 0.

> #read each array entry into new array, split with tab
> foreach (@lines)
> {

More usual style would be

for (@lines) {

> push @{'line_'.${i}}, split("\t", $lines[$i]);

Ohmygoodnessme. These are 'symrefs', and are a *very* bad idea. They are
such a bad idea that 'use strict' will prevent you from using them. What
you want here is an array; and there's no need to keep count of your
array indices as $lines[$i] is already put in $_ by the 'for':

my @A;

# A better name than @A is probably appropriate...

for (@lines) {
    push @A, [ split "\t" ];
}

or, more Perlishly, you could use map:

my @A = map { [ split /\t/ ] } @lines;

The [...] construct is an array ref constructor: see perldoc perlreftut.
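
As a rough sketch of how the resulting structure is used, assuming two
made-up input lines in place of the real file contents:

```perl
use strict;
use warnings;

# Hypothetical input lines, already chomped; real data would come from the file.
my @lines = ("a\tb\tc", "d\te\tf");

# Each row becomes a reference to an anonymous array of fields.
my @A = map { [ split /\t/ ] } @lines;

$A[0][0] = "START";               # row 0, column 0
print $A[0][0], "\n";             # START
print $A[1][2], "\n";             # f
print scalar @{ $A[0] }, "\n";    # 3, the number of fields in row 0
```

Note that split drops trailing empty fields by default, so a row ending in
empty cells comes back shorter than the others; passing a limit of -1, as in
split /\t/, $_, -1, keeps them.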

> $i++
> }
>
> #generate an array to hold the other arrays
> for ($i=0; $i<scalar(@lines); $i++)

Don't use C-style loops where looping over the list itself, or over a
range such as 0..$#lines, will do. It's not good Perl style.

There's no need for that explicit 'scalar': $i < @lines will work
perfectly well.
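
A minimal sketch of the alternatives, using made-up data; the @labels array
is only there to show the result:

```perl
use strict;
use warnings;

my @lines = ("one", "two", "three");   # hypothetical data

# An array in numeric context gives its length, so no scalar() is needed.
print "three lines\n" if @lines == 3;

# When the index is needed, loop over a range rather than C-style.
my @labels;
for my $i (0 .. $#lines) {
    push @labels, "$i:$lines[$i]";
}
print "@labels\n";   # 0:one 1:two 2:three

# When it is not, loop over the elements directly.
for my $line (@lines) {
    print "$line\n";
}
```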

> {
> push @A, *{'line_'.${i}};

I'm slightly amazed this even works... I would have expected you to
need to say *{...}{ARRAY}... anyway, it's YUCK. You shouldn't be
messing with globs (the '*' things) unless you *really* know what you're
doing.

You've already done this, now, anyway (one advantage of doing things
right in the first place :)...

> }
>
> #rename the top corner to be START
> $A[0][0] = "START";
>
> #open file to send data to
> open(OUTFILE, ">a1_edited.txt");

Output can fail too:

open my $OUTFILE, '>', 'a1_edited.txt'
    or die "can't create a1_edited.txt: $!";

> for ($i=0; $i<scalar(@lines);$i++)
> {
>
> for ($j=0;$j<scalar(@line_0);$j++)
> {
> print OUTFILE "$A[$i][$j]";

Don't quote things when you don't need to.

> if ($j<(scalar(@line_0)-1)) { print OUTFILE "\t";}
> }
>
> print OUTFILE "\n";
> }

for (@A) {
    print $OUTFILE join( "\t" => @$_ ), "\n";
}

or even

print $OUTFILE join( "\n", map { join "\t" => @$_ } @A ), "\n";

or, using Perl's special output variables (this is how I'd do it):

{
    open my $OUTFILE, '>', '...' or die "...";
    local ($,, $\) = ("\t", "\n");
    print $OUTFILE @$_ for @A;
}

The 'local' keeps the changes to $, and $\ within the braces.
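
To try this without touching the disk, here is a sketch that prints to an
in-memory filehandle instead of a real file. The data and the $out variable
are assumptions for illustration.

```perl
use strict;
use warnings;

my @A = ( [ "START", "b" ], [ "c", "d" ] );   # hypothetical data

my $out = '';
{
    # An in-memory filehandle stands in for the real output file here.
    open my $OUTFILE, '>', \$out or die "can't open in-memory file: $!";
    local ($,, $\) = ("\t", "\n");   # output field and record separators
    print $OUTFILE @$_ for @A;
}   # $, and $\ are restored, and $OUTFILE is closed, at this brace

print $out;   # two tab-separated lines
```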

> close(OUTFILE);

Again, I would use a scope (set of braces) to close the file unless I
wanted to check for an error on close, which might be a good idea...
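
If you do want that check, a minimal sketch looks like this. It writes into
a temporary directory (an assumption made here so the example cleans up
after itself) rather than the current one.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical location: a scratch directory that is removed at exit.
my $dir  = tempdir( CLEANUP => 1 );
my $file = "$dir/a1_edited.txt";

open my $OUTFILE, '>', $file or die "can't create $file: $!";
print $OUTFILE "START\n";

# Buffered output may only be written out at close, so a full disk or
# similar failure can first show up here.
close $OUTFILE or die "can't close $file: $!";
```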

Ben

-- 
Musica Dei donum optimi, trahit homines, trahit deos.    |
Musica truces molit animos, tristesque mentes erigit.    |   ben@morrow.me.uk
Musica vel ipsas arbores et horridas movet feras.        |

