Re: I have no problems eating cereal...after it softens. Why is replacing a simple string so hard then?




Comments inline jjjjjjjjjjjjj: comment

Tad McClellan wrote:
samiam@xxxxxxxxxxxxxxx <samiam@xxxxxxxxxxxxxxx> wrote:

But I am just beginning Perl,


Then you should probably ask for all the help that you can get.

Put

use warnings;
use strict;

at the top of every program you write, and perl will find
many of your bugs for you.


Have you read a basic tutorial such as "Learning Perl" yet?

See also http://learn.perl.org


My task is sooo deceptively simple: Just replace a simple string with


(your strings are not as "simple" as you think.)


another string. How hard could that be?

My data file is here: http://home.comcast.net/~tankomail/preg.htm
And a sample is at the very bottom of this post. I just want to replace
/<form[.*]?*\/form>/ with the word "block"


That is not a "simple string". That is markup. A robust solution
requires a Real Parser rather than a pattern match (which is only
good for a dirty hack).

jjjjjjjjjjjjjjj: Well, with all the examples I've read in the tutorial,
i.e. the perldocs, I felt drunk with power and thought regexes could
easily handle any string, multi-line or not.


Why did you include the square brackets?

They are not doing what you think they are doing.


jjjjjjjjjjjjjjjjjj: I just love this line - "They are not doing what
you think they are doing."

I thought the brackets grouped whatever is inside them as a unit, so
that the quantifier after them applied to the whole group. I thought I
read this in the Perl docs.
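
A minimal sketch of the difference, using made-up test strings: square
brackets build a character class (one character out of a set), while
grouping is done with parentheses. Inside a class, . and * are just
literal characters.

```perl
use strict;
use warnings;

# [.*] is a character class: it matches ONE literal dot or ONE literal star.
my $class_dot = ('a.b' =~ /a[.*]b/) ? 1 : 0;   # the dot is in the class
my $class_any = ('axb' =~ /a[.*]b/) ? 1 : 0;   # 'x' is neither . nor *

# (?:...) groups without capturing; here .* keeps its usual meaning.
my $group_any = ('axyzb' =~ /a(?:.*)b/) ? 1 : 0;

print "class against 'a.b':   $class_dot\n";
print "class against 'axb':   $class_any\n";
print "group against 'axyzb': $group_any\n";
```

So /<form[.*]?*\/form>/ asks for an optional literal dot or star after
"<form", not for "anything at all".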





Basically I just want to replace all <form> </form> fields and
everything in between with nothing, but in testing, I wanted to see my
work so I chose the word "block" as a good simple substitute which I
could then replace with nothing.


$orgtext = Whey; # this one right here


Is that a string with no quotes, or is that a function call?


jjjjjjjjjjjjjjjjjjjjjjjj: String no quotes.



You should put quotes around your strings. If you had a function
named Whey() defined, then it would be called here and its return
value would be stored in $orgtext.
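
A small sketch of how this plays out with "use strict" in effect, as
recommended earlier in the thread. The string eval is only there so the
compile error can be caught and printed instead of aborting the script;
the variable names are illustrative.

```perl
use strict;
use warnings;

# An unquoted Whey is a bareword; strict refuses to compile it.
my $bareword_ok    = eval q{ my $orgtext = Whey; 1 };
my $bareword_error = $@;

# The quoted version compiles and runs without complaint.
my $quoted_ok = eval q{ my $orgtext = 'Whey'; 1 };

print "bareword: ", (defined $bareword_ok ? "compiled" : "rejected"), "\n";
print "message:  $bareword_error";
print "quoted:   ", ($quoted_ok ? "compiled" : "rejected"), "\n";
```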

jjjjjjjjjjjjjjjjjjjjjj: I know... but in my tiny patchwork programs
with no functions and only a smidgen of lines...I am likely to be as
sparse and functional as possible if there are no negative rams.
No...not male goats with bad attitudes.



"use strict" will enforce putting quotes around your strings.


$newtext = Popcorn;

The above works. I reduced it to its simplest form as a sanity check.
Then I tried:

$orgtext = /[Ww]hey/; # this one right here


"use warnings" would have helped you here. I expect you meant
this instead:

$orgtext =~ /[Ww]hey/;


jjjjjjjjjjjjjjjjjjjjjjjjjj: Well...actually I was using the assign
operator deliberately to pop the whey into the $orgtext. My pre/post
processing of $orgtext I think is what's lacking.



^
^

$newtext = Popcorn;

But beyond the most primitive replacement, I invariably get:

Use of uninitialized value in pattern match (m//) at
C:\russ\scripts\_Master_Snippets\clean_2_input_output_file.pl line 9.


Oh. You _are_ using warnings.


jjjjjjjjjjjjjjjjjjjjjjj: See? I am a pro! Want to hire me to code the
next space shuttle launch?




The uninitialized value is in $_.

Since you don't have a proper binding operator (=~) the pattern
is NOT attempting to match against the string in $orgtext, it
is trying to match the string in $_, and it appears you do not
have a string in there.
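
To make the point concrete, here is a hedged sketch with invented
sample strings. It shows what a pattern match with and without the
binding operator actually tests, and what ends up in the variable on
the left of the "=".

```perl
use strict;
use warnings;

my $text = 'curds and whey';

# Without =~ the pattern is tried against $_, and the variable receives
# the success flag of that match, not any text.
local $_ = 'nothing relevant here';
my $assigned = /[Ww]hey/;             # tests $_, so this is false

# With =~ the pattern is tried against $text.
my $bound = $text =~ /[Ww]hey/;       # true

# To get the matched text itself, capture it in list context.
my ($found) = $text =~ /([Ww]hey)/;

print "unbound match: ", ($assigned ? "true" : "false"), "\n";
print "bound match:   ", ($bound    ? "true" : "false"), "\n";
print "captured text: $found\n";
```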

jjjjjjjjjjjjjjjjjjjjjj: but since I was not trying to match...to
assign.....


You inserted a bug, and "use warnings" found it for you!


Eventually I want to try:

$orgtext = /<form[.*]?*\/form>/; # this one right here
$newtext = block;

But I can't get past the starting blocks. I know this code works in
general,


There is no way this code works. It has multiple errors in it.


It's someone else's, and with simple code, i.e. basic words, it worked
great for me. It was just in the modifying to parse multi-line HTML
that I had the problem. But from my reading, I understand a dedicated
parser is best, but for a tiny little job like this, and considering
the awesome might and claims and indeed history of plain regex, I saw
no reason I couldn't just use this code to do the task.





Any suggestions as to:

1.) Is my basic model okay, slurping the whole file into a variable? or
2.) Should I use a while <> structure?


That depends on whether or not you need to match across multiple lines.

If you _need_ multiple lines then slurping is good, else line-by-line
is better.
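
A rough sketch of the two approaches side by side. The sample data is
made up and read through an in-memory filehandle so the example is
self-contained; with a real file you would open its path instead.

```perl
use strict;
use warnings;

# Made-up sample standing in for the real HTML file.
my $data = "before\n<form action=\"x\">\n<input>\n</form>\nafter\n";

# Line by line: no single line contains both <form and </form>.
open my $by_line, '<', \$data or die "Cannot open in-memory file: $!";
my $line_hits = 0;
while (my $line = <$by_line>) {
    $line_hits++ if $line =~ m#<form.*?</form>#s;
}
close $by_line;

# Slurp: with $/ undefined, one read returns the whole file.
open my $all, '<', \$data or die "Cannot open in-memory file: $!";
my $whole = do { local $/; <$all> };
close $all;
my $slurp_hits = () = $whole =~ m#<form.*?</form>#sg;

print "line-by-line matches: $line_hits\n";
print "slurped matches:      $slurp_hits\n";
```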


jjjjjjjjjjjjjjjjjjjjjjjjjj: Yup... definitely need multiple lines since
the <form> tags span quite a few lines and characters.

I am surprised there isn't just a:

Start at <form
read until one gets to </form> regardless of \n, \t or anything else
remove all of it.

If this is not simple...it should be. Maybe I am talking like an
idyllic martian...but this kind of basic read until string and remove
it all function should be added if not extant now.
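
For what it is worth, that "start here, read until there, remove it
all" recipe can be sketched as a single substitution. The sample string
is invented; the /s modifier lets . match newlines, and the ? after .*
makes the match stop at the first </form> rather than the last one.

```perl
use strict;
use warnings;

# Two made-up forms with text between them, spread over several lines.
my $sample = "A\n<form>\n1\n</form>\nB\n<form>\n2\n</form>\nC\n";

# Non-greedy: each <form ... </form> pair is removed on its own.
(my $non_greedy = $sample) =~ s#<form.*?</form>##sg;

# Greedy: .* runs to the LAST </form>, so the text between forms is lost.
(my $greedy = $sample) =~ s#<form.*</form>##sg;

print "non-greedy keeps B: ", ($non_greedy =~ /B/ ? "yes" : "no"), "\n";
print "greedy keeps B:     ", ($greedy     =~ /B/ ? "yes" : "no"), "\n";
```

As noted above, this is still a pattern match on markup, so it will
trip over things like nested forms or "</form>" inside a comment.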




And even when I do get the simple Whey replaced with Popcorn - it only
does the first instance, basically, I am guessing, because there is no
iterative code in this script.


You should read about the pattern match operator in perlop.pod if
you are having trouble using the pattern match operator.

It tells how to make a pattern find all occurrences.
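
As a small illustration with an invented string: the modifier in
question is /g. Without it a substitution stops after the first match;
with it every match is replaced, and s///g returns how many
replacements it made.

```perl
use strict;
use warnings;

my $first_only = 'Whey, whey and more Whey';
my $every      = $first_only;

# No /g: only the first occurrence is replaced.
$first_only =~ s/[Ww]hey/Popcorn/;

# With /g: all occurrences are replaced; the return value is the count.
my $count = ($every =~ s/[Ww]hey/Popcorn/g);

print "without /g: $first_only\n";
print "with /g:    $every ($count replacements)\n";
```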


Hmmm....I will find this mystical all-occurrences switch. I just wish
Mastering Regular Expressions, 3rd edition, was more reference-like and
less anecdote-chatty. Finding a specific piece of info in that Arthur
Miller play is daunting.






Your input and examples are GREATLY appreciated because the red spot on
my banging-against-the-cubicle-wall head is growing.


----------------------------
#!/usr/bin/perl
use warnings;
use strict;
use LWP::Simple;

my $html = get 'http://home.comcast.net/~tankomail/preg.htm';
defined $html or die "Could not fetch the page\n"; # get() returns undef on failure

$html =~ s#<form .*?</form>#BLOCK#sg; # use alternate delimiters

print $html;
----------------------------


--
Tad McClellan SGML consulting
tadmc@xxxxxxxxxxxxxx Perl programming
Fort Worth, Texas
