Re: regex newbie question

From: Bob Walton (see_sig_at_invalid)
Date: 03/28/05


Date: Sun, 27 Mar 2005 21:33:44 -0500

ZMAN wrote:

...
> Reading in lines from a file.
> I want to ignore all the text before it gets to the line
> "<!--document_starts_here-->"
> and write out the remainder of the text to a file.
>
> This is the way I'm attempting this..
...

> BZ
>

Let Perl give you all the help it can:

use strict;
use warnings;

Also, please post a short but complete example illustrating the
problem that anyone can copy/paste/execute. Yours fails this on
several counts: needed variables like $data_file are not
defined, the content of @lines is not defined, etc. See below
for suggested style.

> open DATAOUT, ">$data_file" or die "can't open $data_file $!";
>
> foreach $line (@lines)
> {
>
> if ($line =~ m/<!--document_starts_here-->/i)
> {
> print "This line contains the word : $line\n";
>
> #### write remainder of file out
> }
>
> print DATAOUT "$line";

This print is executed unconditionally, so it will output every
line that is read in. You need to execute this only on lines
following your "starts_here" flag line.

>
> }
>
>
> close (DATAOUT)

Here is one way:

use warnings;
use strict;
my @lines=<DATA>;
my $printok=0;
foreach my $line (@lines){
      print "to DATAOUT ->$line" if $printok;
      if ($line =~ m/<!--document_starts_here-->/i){
           print "This line contains the word : $line";
           $printok++;
      }
}
__END__
This line should be ignored
<!--document_starts_here-->
This is line 3
line 4
last line

Note that I print to STDOUT instead of an output file: the file
is superfluous to the question being asked and makes the post
harder for readers to copy and run. I also used the <DATA>
filehandle to supply the contents of @lines. Finally, I removed
the \n at the end of your "This line contains..." print
statement, since $line already contains the \n read from the
input.
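
For the original goal of writing the remainder to a file, the same
flag logic can be applied while reading line by line, without
loading everything into @lines first. The following is only a
minimal sketch: the sub name copy_after_marker is hypothetical, and
the in-memory filehandles (Perl 5.8 or later) are an assumption made
so the example runs without touching the disk. For a real file the
output open would be something like open my $out, '>', $data_file.

```perl
use strict;
use warnings;

# Hypothetical helper: copy every line that follows the first marker
# line from $in_fh to $out_fh. Returns the number of lines copied.
sub copy_after_marker {
    my ($in_fh, $out_fh) = @_;
    my $printok = 0;    # becomes true once the marker line has been seen
    my $copied  = 0;
    while (my $line = <$in_fh>) {
        if ($printok) {
            print {$out_fh} $line;
            $copied++;
        }
        elsif ($line =~ /<!--document_starts_here-->/i) {
            $printok = 1;    # start copying with the next line
        }
    }
    return $copied;
}

# Demonstration with in-memory filehandles (an assumption for this
# sketch); lexical handles and three-argument open are used throughout.
my $input = "This line should be ignored\n"
          . "<!--document_starts_here-->\n"
          . "This is line 3\n"
          . "line 4\n"
          . "last line\n";
my $output = '';

open my $in,  '<', \$input  or die "can't open input: $!";
open my $out, '>', \$output or die "can't open output: $!";
my $count = copy_after_marker($in, $out);
close $out or die "can't close output: $!";
close $in;

print $output;    # prints the three lines after the marker
```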

There are also trickier ways of doing what you want, such as
defining the input record separator to be the
"document_starts_here" string. Something like:

use warnings;
use strict;
my @lines;
{
    local $/="<!--document_starts_here-->\n";
    @lines=<DATA>;
}
print $lines[1];
__END__
This line should be ignored
<!--document_starts_here-->
This is line 3
line 4
last line
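
Some caveats on the record-separator approach: $/ is matched as a
literal, case-sensitive string (unlike the m//i test above), the
marker must be followed immediately by "\n", and if the marker
appears more than once, $lines[1] holds only the text up to the
second occurrence. If the marker is missing, $lines[1] is undefined
and the print warns. A minimal sketch that handles these last two
cases, assuming a hypothetical sub name text_after_marker and an
in-memory filehandle for the demonstration:

```perl
use strict;
use warnings;

# Hypothetical helper: return everything after the first marker line,
# or the empty string if the marker never appears.
sub text_after_marker {
    my ($fh) = @_;
    local $/ = "<!--document_starts_here-->\n";    # literal, case-sensitive
    my @records = <$fh>;
    shift @records;              # drop the text up to and including the first marker
    return join '', @records;    # any later marker lines stay in the result
}

# Demonstration with an in-memory filehandle (an assumption for this sketch).
my $input = "This line should be ignored\n"
          . "<!--document_starts_here-->\n"
          . "This is line 3\n"
          . "line 4\n"
          . "last line\n";
open my $fh, '<', \$input or die "can't open input: $!";
my $rest = text_after_marker($fh);
close $fh;

print $rest;    # prints the three lines after the marker
```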

HTH.

-- 
Bob Walton
Email: http://bwalton.com/cgi-bin/emailbob.pl