Re: Perl Strings vs FileHandle



shadabh wrote:

On Sep 17, 12:39 pm, "John W. Krahn" <some...@xxxxxxxxxxx> wrote:
shadabh wrote:
On Sep 6, 6:08 pm, "John W. Krahn" <some...@xxxxxxxxxxx> wrote:
shadabh wrote:
Hi all,
Just wanted to run this past the Perl gurus to see if it is
correct. I have a file that could possibly be 1GB of variable-
length EBCDIC data. I will read the file as EBCDIC data and, based
on some criteria, split it into 100 different files which will add
up to the 1GB. This way a particular copy book can be applied to
each of the split files. The approach I am using is a filehandle
(IO::FileHandle and $Strings), substr, and writing out to 100
different files after applying the 'logic'. I will use two
routines, one to read and one to write. I have tested this out
with a 100MB file and it works fine. The question, though: is there
a memory limit to this, as we are using strings to break up the
files? Or is there an alternative way to do this? Comments,
suggestions, improvements and alternatives will really help to
design the code. Thanks
You have to show us what "this" is first.  It is kind of hard to
make comments, suggestions, improvements and alternatives to
something that we cannot see.

John
--
Perl isn't a toolbox, but a small machine shop where you
can special-order certain sorts of tools at low cost and
in short order.                            -- Larry Wall

Sure. Here is what I meant by 'this way'. Please comment. Thanks

while ($raw_data) {

You don't modify $raw_data inside the loop so there is no point in
testing it every time through the loop.

$var_len=$INIT_LEN+$AC_NUM_LENGTH;

my $var_len = $INIT_LEN + $AC_NUM_LENGTH;

$val = substr $raw_data,$var_len,4;
$asc_string = $val;

my $asc_string = substr $raw_data, $var_len, 4;

eval '$asc_string =~ tr/\000-\377/' . $cp_037 . '/';
# open(OUTF, '>>ebcdic_ID.txt');
# #print OUTF $var_len, $asc_string, "\n";
# close OUTF;
open(PARM, '<TSY2_PARM.par') || die("Could not open the
parammeter file $PARM_FILE ! File read failed ..check iF file
exits");

You are opening 'TSY2_PARM.par' but your error message says you are
opening $PARM_FILE.  You should include the $! variable in the error
message so you know *why* it failed to open.

open my $PARM, '<', 'TSY2_PARM.par'
    or die "Could not open 'TSY2_PARM.par' $!";

$parm_data = <PARM>;

You assign the same data to $parm_data every time so...

if (($parm_data =~ m!($asc_string)!g) eq 1) {

your pattern will always match at the same place every time through
the loop so if the pattern is present you have an infinite loop.  You
are not using the contents of $1 so the parentheses are superfluous.
The comparison test to the string '1' is superfluous.

if ( $parm_data =~ /$asc_string/g ) {

$COPYBOOK_LEN = substr $parm_data,length($`)+4,4;

The use of $` is discouraged as it slows down *all* regular
expressions in your program.

my $COPYBOOK_LEN = substr $parm_data, $-[ 0 ] + 4, 4;
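
As a minimal sketch of that replacement: $-[0] holds the offset at which the last successful match started, which is the same number that length($`) would give. The parameter string and ID below are made up for illustration and are not the poster's real data.

```perl
use strict;
use warnings;

# Hypothetical parameter data: 4-character IDs, each followed by a
# 4-digit copy book length.
my $parm_data = 'AAAA0100BBBB0250';
my $id        = 'BBBB';

my ( $offset, $copybook_len );
if ( $parm_data =~ /\Q$id\E/ ) {
    # Start offset of the whole match, without touching $`.
    $offset = $-[0];
    # The length field sits right after the 4-character ID.
    $copybook_len = substr $parm_data, $offset + 4, 4;
}

print "$copybook_len\n";
```

With this sample data the match starts at offset 8, so the script prints 0250.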

print $COPYBOOK_LEN;
close (PARM);
$OUT_DATAFILE = 'EBCDIC_'.$asc_string.'.txt';
$RECORD_LEN= $COPYBOOK_LEN+$HEADER_DATA;
open(OUTF, ">>$OUT_DATAFILE")|| die("Could not open file. File
read failed ..check id file exits");

open my $OUTF, '>>', $OUT_DATAFILE
    or die "Could not open '$OUT_DATAFILE' $!";

print OUTF substr $raw_data, $INIT_LEN, $RECORD_LEN;
close OUTF;
$INIT_LEN = $INIT_LEN + $RECORD_LEN;
print $INIT_LEN;
print $var_len;
}
else {
print 'End of file reached or copy book is not a part of the
loading process', "\n";
exit 0;
}
}

So, to summarise:

open my $PARM, '<', 'TSY2_PARM.par'
    or die "Could not open the parameter file 'TSY2_PARM.par' $!";
my $parm_data = <$PARM>;
close $PARM;

while ( 1 ) {

my $var_len = $INIT_LEN + $AC_NUM_LENGTH;
my $asc_string = substr $raw_data, $var_len, 4;
eval '$asc_string =~ tr/\000-\377/' . $cp_037 . '/';

if ( $parm_data =~ /\Q$asc_string\E/g ) {

my $COPYBOOK_LEN = substr $parm_data, $-[ 0 ] + 4, 4;
my $OUT_DATAFILE = "EBCDIC_$asc_string.txt";
my $RECORD_LEN   = $COPYBOOK_LEN + $HEADER_DATA;

open my $OUTF, '>>', $OUT_DATAFILE
    or die "Could not open '$OUT_DATAFILE' $!";
print $OUTF substr $raw_data, $INIT_LEN, $RECORD_LEN;
close $OUTF;

$INIT_LEN += $RECORD_LEN;
print $COPYBOOK_LEN, $INIT_LEN, $var_len;
}
else {
print "End of file reached or copy book is not a part of the ",
      "loading process\n";
last;
}
}

exit 0;

__END__

John
--
Perl isn't a toolbox, but a small machine shop where you
can special-order certain sorts of tools at low cost and
in short order.                            -- Larry Wall

Thanks for all your suggestions/comments. So what do you think of
using this method? Is there an alternative at all?
I ask because $parm_data could potentially contain GB's worth of data.
Suggestions to improve the method are welcome. BTW all your comments
are good and I have included them in my code. TY

Shadab

You said that "$parm_data could potentially contain GB's worth of data",
in which case the line below:

my $parm_data = <$PARM>;

is going to eat your resources and could even crash or overload the
system. You need to do a while loop on the <$PARM> file handle and
step through it line by line (instead of reading it into a string or
array first). You can cut out the middle man that way, too. Otherwise
you are reading a potentially huge amount of data into a string (into
memory) and then trying to parse it, so memory use grows with the
size of the file.
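
A minimal sketch of that idea follows. In scalar context, <$PARM> reads up to the next line terminator, and binary EBCDIC data may not contain one, so this sketch assumes fixed-size blocks fetched with read() rather than lines. The helper name process_in_chunks and its parameters are hypothetical, not something from the original code.

```perl
use strict;
use warnings;

# Hypothetical helper: calls $callback once for each block of up to
# $chunk_size bytes read from $path, and returns the total bytes read.
sub process_in_chunks {
    my ( $path, $chunk_size, $callback ) = @_;

    open my $in, '<', $path or die "Could not open '$path': $!";
    binmode $in;    # binary data: no newline or encoding translation

    my $total = 0;
    while (1) {
        my $got = read $in, my $buffer, $chunk_size;
        die "Read from '$path' failed: $!" unless defined $got;
        last if $got == 0;    # end of file
        $callback->($buffer);
        $total += $got;
    }

    close $in;
    return $total;
}
```

Only one block is held in memory at a time, so memory use depends on the chunk size rather than the file size. With variable-length records a record can straddle two blocks, so the callback would need to carry any incomplete tail over to the next call.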
--
Tim Greer, CEO/Founder/CTO, BurlyHost.com, Inc.
Shared Hosting, Reseller Hosting, Dedicated & Semi-Dedicated servers
and Custom Hosting. 24/7 support, 30 day guarantee, secure servers.
Industry's most experienced staff! -- Web Hosting With Muscle!