Re: Extract range of lines from a text file



Dr.Ruud wrote:
Amer Neely wrote:

I'm walking through a mailbox file, and want to pull out specific
lines from each message. The body of each message is in a similar
format, having been generated by a script.

I'm doing OK except for one particular block of lines, the customer
address data. There is a blank line before and after this block.
Example:

Transaction Time: 18:45:55

Amer Neely
POB 1481 Station Main
North Bay ON
P1B 8K7
CANADA

123-456-7890


Or use a simplified state machine.

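# States: -1 = looking for the header line,
#          0 = header seen, expecting the blank line after it,
#          1 = inside the address block.
my $state = -1;
my $line  = -1;

while (<>) {
    chomp; # s/^\s+//; s/\s+$//;

    if (-1 == $state) {
        if (/^Transaction Time:/) {
            ++$state;
        }
    }
    elsif (0 == $state) {
        if (/^$/) {
            ++$state;
            $line = 0;
        }
        else {
            die "$state: <$_>?";
        }
    }
    elsif (1 == $state) { # in address
        if (/^$/) {
            # skip the blank line that closes the block
        }
        elsif (/^\d{3}-\d{3}-\d{4}$/) {
            # the phone number ends the address; start over
            $state = -1;
            $line  = -1;
        }
        else {
            ++$line;
            print "$line: $_\n";
        }
    }
    else {
        die "$state: <$_>?";
    }
}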

(untested)
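
One way to try the state machine without a real mailbox is to feed it the sample message from an in-memory filehandle. The following is only a sketch: the subroutine name extract_address and the sample data are made up for illustration, and it assumes the file is read one line at a time (the default $/).

```perl
use strict;
use warnings;

# Hypothetical helper: collects the address lines that follow each
# "Transaction Time:" header. Reads line by line from a filehandle.
sub extract_address {
    my ($fh) = @_;
    my $state = -1;   # -1: looking for header, 0: expect blank line, 1: in address
    my @lines;

    while (my $row = <$fh>) {
        chomp $row;
        if ($state == -1) {
            $state = 0 if $row =~ /^Transaction Time:/;
        }
        elsif ($state == 0) {
            die "expected a blank line, got <$row>" unless $row =~ /^$/;
            $state = 1;
        }
        else {
            next if $row =~ /^$/;                  # blank line after the block
            if ($row =~ /^\d{3}-\d{3}-\d{4}$/) {   # phone number ends the block
                $state = -1;
                next;
            }
            push @lines, $row;
        }
    }
    return @lines;
}

# Sample message body, copied from the example in the question.
my $sample = join "\n",
    'Transaction Time: 18:45:55', '',
    'Amer Neely', 'POB 1481 Station Main', 'North Bay ON', 'P1B 8K7', 'CANADA', '',
    '123-456-7890', '';

open my $fh, '<', \$sample or die "cannot open in-memory file: $!";
my @address = extract_address($fh);
close $fh;

print "$_\n" for @address;
```

Returning the lines instead of printing them makes it easier to store the address per message while walking the mailbox.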


Very interesting. It's a little more complex than I need (see the reply by Xicheng Jia dated today 11:57). That works for me.

I did adapt your code (with a few minor fixes) to my situation, but got an error when I ran it on my test input file:
Sun Apr 9 12:29:32 2006
0: <xxxxxxxxxxxx
xxxxxxxxxxxxxxx
SAULT STE MARIE Ontario
P6A 3P4
CANADA>? at parse_mail7.pl line 39, <IN> chunk 2.
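
That message is consistent with the file being read in paragraph mode: Perl reports "chunk" rather than "line" when $/ is not a newline, and the record shown between < and > holds the whole address block. The state machine expects one line per read, so it dies in state 0 when a full paragraph arrives. If paragraph mode is what the script uses, the block can be taken directly as the paragraph after the header. A rough sketch, assuming the "Transaction Time:" line is the last line of its own paragraph; the variable names and sample data are invented for illustration:

```perl
use strict;
use warnings;

# Sample message body, copied from the example in the question.
my $mail = join "\n",
    'Transaction Time: 18:45:55', '',
    'Amer Neely', 'POB 1481 Station Main', 'North Bay ON', 'P1B 8K7', 'CANADA', '',
    '123-456-7890', '';

open my $in, '<', \$mail or die "cannot open in-memory file: $!";

my @address_lines;
{
    local $/ = '';   # paragraph mode: each read returns one blank-line-delimited block

    while (my $para = <$in>) {
        # /m lets ^ match at the start of any line inside the paragraph
        next unless $para =~ /^Transaction Time:/m;

        my $block = <$in>;            # the paragraph right after the header
        last unless defined $block;
        $block =~ s/\n+\z//;          # drop the trailing newline(s)
        @address_lines = split /\n/, $block;
        last;                         # only the first message in this sketch
    }
}
close $in;

print "$_\n" for @address_lines;
```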

Thank you for this very different approach. I will keep it in mind for other situations.
--
Amer Neely
Home of Spam Catcher
W: www.softouch.on.ca
E: trudge@xxxxxxxxxxxxxx
Perl | MySQL | CGI programming for all data entry forms.
"We make web sites work!"