Re: How to boost performance of my crude script?

DB <poo@xxxxxxx> wrote in

Hi - I admit to being a total hack at perl, but I did get the below
script to work. The problem is that it goes quite slowly.

I haven't looked at your script in detail yet, but at first sight, it looks
very disorganized and does not follow best practices.

The code parses a comma delimited text file of approx 250,000 lines
and generates another text file with modified format. For certain
lines of the data, the code must parse again through the same file and
I believe this is why it is slow.


perhaps by loading the file into RAM all at once?

Depends on how much RAM you have, how many times you have to re-iterate
through the data and how loaded the system is etc.

On the other hand, if you do have to go through data many times, just
create an SQLite database to make subsequent lookups faster.
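
If the data fits in memory, a lighter-weight alternative to a database is
to read the file once and index it with a hash. What follows is only a
sketch under assumptions: I am guessing that the second field
($fields1[1] in your script) is the part number you look up on later
passes, the sample rows are made up, and build_index is a name I
invented for illustration.

```perl
use strict;
use warnings;

# Build an in-memory index: part number (assumed to be field 1)
# => reference to the array of fields for that line.
sub build_index {
    my ($fh) = @_;
    my %by_pn;
    while ( my $line = <$fh> ) {
        chomp $line;
        my @fields = split /,/, $line;
        next unless defined $fields[1];    # skip short or empty lines
        $by_pn{ $fields[1] } = \@fields;
    }
    return \%by_pn;
}

# Made-up rows in an in-memory filehandle stand in for data.txt.
my $sample = "x,A100,USE B200\nx,B200,Widget\n";
open my $in, '<', \$sample
    or die "Cannot open in-memory sample: $!";
my $index = build_index($in);
close $in;

# A later lookup is a hash access instead of another pass over the file.
print $index->{B200}[2], "\n";
```

A plain split on commas breaks on quoted fields that contain commas; if
your data has those, a CSV parsing module such as Text::CSV from CPAN is
the safer choice.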


use strict;


use warnings;

if (-e "output.txt") {
die "outfile already exists on system";

my $output_fn = 'output.txt';

die "'$output_fn' already exists\n" if -e $output_fn;

open (MYFILE, '>>output.txt');

Why are you opening the file for appending if your program is supposed
to die if it already exists?

Always, *always* check if open succeeded. Use lexical filehandles for
better scoping:

open my $output, '>', $output_fn
or die "Cannot open '$output_fn' for writing: $!";

# Print header row
print MYFILE "sku\tdesc\tremark\tprice\tgroup\tcat\twt\ttit\n";

print $output
join("\t", qw( sku desc remark price group cat wt tit )), "\n";


my $count;

Why do you need this variable? Are you just counting lines? Use $. for that.
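
To illustrate, here is a minimal sketch of $. (the current input line
number of the last filehandle read) replacing a hand-maintained counter.
The sample data is made up and read from an in-memory filehandle so the
snippet runs on its own.

```perl
use strict;
use warnings;

my $sample = "a,1\nb,2\nc,3\n";    # made-up data
open my $in, '<', \$sample
    or die "Cannot open in-memory sample: $!";

my @seen;
while ( my $line = <$in> ) {
    chomp $line;
    push @seen, "$.: $line";       # $. is the line number just read
}
close $in;

print "$_\n" for @seen;
```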

$use="USE"; # define trigger strings

$file = 'data.txt'; # open file
open (F, $file) || die ("Could not open $file!");

while ($line = <F>)

$count=$count + 1;;
my @fields1 = split(',', $line); # load array

The rest is a mess.

The fact that you don't declare variables in the smallest possible scope
makes everything really hard to follow. Indentation is a mess. Your code
is not self explanatory and I cannot figure out what supered means.

# Initialize output variables

print "Working on item |$fields1[1]|, length= $description_length,
SKU= $SKU count=$count \n";

Ahem, $description_length is undefined at this point.

$supered="NO"; #initialize

$desc_portion = substr($fields1[2],0,3);
if ($desc_portion eq $use) { # if supered...

$length_leftover = (length($fields1[2])-4); # find the super
$sup_pn = substr($fields1[2],4,$length_leftover);

This is insane.
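
There is no need to compute the leftover length: when the length
argument is omitted, substr returns everything from the given offset to
the end of the string. A small sketch, with a made-up description value
standing in for $fields1[2]:

```perl
use strict;
use warnings;

my $use  = 'USE';
my $desc = 'USE B200';    # made-up description field

my $sup_pn;
if ( substr( $desc, 0, 3 ) eq $use ) {
    # No length argument: take everything from offset 4 onward.
    $sup_pn = substr $desc, 4;
}

print "$sup_pn\n" if defined $sup_pn;
```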

if ($sup_pn eq $fields1[1]) { #if supers to self
print "|$fields1[1]| supers to |$sup_pn|... \n";

You print out this message even if $supered is set to 'NO'. More

while ($supered eq "YES")

You are using a loop when, it looks like, you should be using a

Since I just do not have any idea what you are trying to achieve, and no
input data, I gave up after trying for 20 minutes to refactor your code.

Please read the posting guidelines for this group.