Re: backreferences in search pattern

From: Bob Walton (see_at_sig.invalid)
Date: 11/12/04


Date: Thu, 11 Nov 2004 22:49:11 -0500

Marek Stepanek wrote:
...
> I try to set up a Perl-Filter (for BBEdit on Macintosh). I want to match and
> print out all gifs needed for a rollover in an HTML-File. The filter should

Unless you control the layout of the HTML, you would be much better off
parsing it with an HTML parser module (HTML::Parser, perhaps).
Correctly parsing HTML is much harder than it appears at first glance.

> match
>
> onmouseover="document.podium.src='../../pix/grafix/hili_podium.gif';
>
> but also beasts like the following :
>
> onmouseover="document.podium.src='../../pix/logos/theembassy1.gif';
> document.bios.src='../../pix/logos/theembassy2.gif';
> document.addresses.src='../../pix/logos/theembassy3.gif';
> document.events.src='../../pix/logos/theembassy4.gif';
> document.priwat.src='../../pix/logos/theembassy5.gif';
> document.yrpodium.src='../../pix/logos/theembassy6.gif';
> document.logo.src='../../pix/logos/hili_pilogo.gif'"
>
>
> the result should print from the above examples :
>
> '../../pix/grafix/hili_podium.gif',
> '../../pix/logos/theembassy1.gif',
> '../../pix/logos/theembassy2.gif',
> '../../pix/logos/theembassy3.gif',
> '../../pix/logos/theembassy4.gif',
> '../../pix/logos/theembassy5.gif',
> '../../pix/logos/theembassy6.gif',
> '../../pix/logos/hili_pilogo.gif',
>
>
> I am labouring since a long while already on this filter. Could somebody
> help me out ? I am beginner and want learn Perl very much.
>
> This filter here produces the following error :
>
>
> Modification of a read-only value attempted <> line 1.
>
>
> (mind line breaks from my email client)
>
>
> #!/usr/bin/perl -w
>
You are missing:

    use strict;
    use warnings;

Let Perl help you all it can. I see you did use the -w switch -- but
these days it is better to

    use warnings;

because of the additional control over the warnings it affords. See
below for what I am talking about.

>
> while (<>) {
> ($1, $2, $3) =
---^^--^^--^^

Those are the read-only variables that you are attempting to modify. See

    perldoc perlvar

particularly the section about $<digits>. Note that the $1, $2, etc.
variables are set automatically by a successful match, one for each
capturing set of parentheses in the regular expression. Judging from
your next couple of lines of code, you probably want:

    my @results =

on this line, so that the match runs in list context and returns the
captured substrings directly, and then remove the
"my @results = ($1, $2, $3);" line.
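
To illustrate the list-context match described above, here is a minimal sketch. The pattern is a deliberately simplified stand-in (any single-quoted name ending in .gif), not the original poster's regex, and the sample string in $line is an assumption built from the examples quoted earlier.

```perl
use strict;
use warnings;

# Hypothetical sample input, built from the examples quoted above.
my $line = q{onmouseover="document.podium.src='../../pix/logos/theembassy1.gif'; }
         . q{document.bios.src='../../pix/logos/theembassy2.gif'"};

# In list context, m//g returns every captured substring from every match,
# so nothing ever has to be assigned to $1, $2, ...
my @results = $line =~ m!'([^']+\.gif)'!g;

# Loop over the captured values themselves rather than over their indices.
foreach my $gif (@results) {
    print "'", $gif, "',\n";
}
```

Because of the /g modifier, one capturing group is enough here no matter how many file names the attribute holds.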

> m!onmouseover="[^']+'([^']+)'(?:(?:(?:;\s+[^']+)'([^']+)')+)?(?:(?:[^']+)'([
> ^']+)')?"!g;
> my @results = ($1, $2, $3);
> foreach $i (0..$#results) {
> print "'", $i, "',\n";
> }
> }
>
> My question is not about the grep search pattern, but how to put the
> backreferences into an array to print it out.
>
> one other try gives :
>
> Use of uninitialized value in print <> line 2.

Note that that is a warning, and is not fatal. Your program should
continue to execute normally.
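
A small sketch of that point, assuming nothing beyond core Perl: the __WARN__ handler and the variable names are only there for the demonstration. The warning is collected, and the statements after it still run.

```perl
use strict;
use warnings;

# Collect warnings instead of printing them, purely for this demonstration.
my @seen;
$SIG{__WARN__} = sub { push @seen, $_[0] };

my $missing;                            # deliberately left undefined
my $text = "'" . $missing . "',";       # warns, but does not die
my $still_running = 1;                  # reached, so the warning was not fatal

print "warnings seen: ", scalar(@seen), "\n";
print "text built anyway: $text\n";
```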
>
>
> #!/usr/bin/perl -w
>
> while (<>) {
> while (
> s!onmouseover="[^']+'([^']+)'(?:(?:(?:;\s+[^']+)'([^']+)')+)?(?:(?:[^']+)'(
> [^']+)')?"!!g ) {print "'", $1, "',\n"; print "'", $2, "',\n"; print "'",
> $3, "',\n";}
> }

The uninitialized value probably came from a pair of capturing
parentheses that took no part in the match because it was inside the
scope of a "?" metacharacter and the "no occurrences" branch was taken.
After a successful match, $1, $2, etc. are defined only for the sets of
parens actually used in the final match; the rest are undef. This
particular warning is not very useful in a situation like this, which is
why it can be suppressed in a block with a
"no warnings qw(uninitialized);" statement at its start. That means
something like:

   while(<>){
      no warnings qw(uninitialized);
      s!onmouseover="...
   }

But you can only do that if you

    use warnings;

rather than the -w switch.
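
Here is a runnable sketch of the undef capture and the block-scoped "no warnings". The input string and the shortened pattern are assumptions made for illustration, not the original poster's filter. The last two lines show an alternative that avoids the warning by skipping undefined captures.

```perl
use strict;
use warnings;

# Hypothetical input with a single quoted file name, so the optional
# second group has nothing to match.
my $input = "document.podium.src='a.gif'";

# The second capture sits inside (?: ... )?; when that branch is skipped,
# the group takes no part in the match and its value is undef.
my ($first, $second) = $input =~ m!'([^']+)'(?:;\s*[^']+'([^']+)')?!;

{
    # Silence only this category, and only inside this block.
    no warnings qw(uninitialized);
    print "'", $first,  "',\n";
    print "'", $second, "',\n";   # prints '', without a warning
}

# Alternative: skip the groups that did not take part in the match.
my @found = grep { defined } $first, $second;
print "'$_',\n" for @found;
```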

> marek

-- 
Bob Walton
Email: http://bwalton.com/cgi-bin/emailbob.pl

