Re: Pulling out data between <TD> tags using regular expressions
- From: Eric Schwartz <emschwar@xxxxxxxxx>
- Date: Thu, 26 May 2005 16:35:48 -0600
tdmailbox@xxxxxxxxx writes:
> If I had this tag and wanted to return 123 how would I do it? I have
> tried countless methods but cannot get only the 123 without the
> <TD> tags
>
> <TD class=tblform3 id=L_listing width=23>123</TD>
>
> After 3 hours I am giving up and asking the experts.
If you'd asked your computer, you'd have had the answer much faster:

    perldoc -q HTML

And the first returned result is:

    "How do I remove HTML from a string?"

That is exactly what you need. If you get in the habit of searching
your local documentation first, you'll get better answers faster,
because you won't have to wait for a reply here. The people who can
give you the best answers are also tired of answering the same
questions over and over, which is why they wrote the FAQ in the first
place! So if you ask FAQs here, then as a rule you will only get the
less-experienced people answering your questions.
But I'm feeling generous, and I'd been meaning to poke at
HTML::Parser for a while anyhow, so I whipped up this little example:
#!/usr/bin/perl
use warnings;
use strict;
use HTML::Parser ();

sub start_handler
{
    return if shift ne "td";
    my $self = shift;
    $self->handler(text => sub { print shift }, "dtext");
    $self->handler(end  => sub { shift->eof if shift eq "td"; },
                           "tagname,self");
}

my $p = HTML::Parser->new(api_version => 3);
$p->handler( start => \&start_handler, "tagname, self" );
$p->parse( <<EODATA );
<TD class=tblform3 id=L_listing width=23>123</TD>
EODATA
print "\n";
__END__
For future reference, if you have a problem, you're going to get the
best results here if you can create an example of it that looks
something like that: short (I went to 21 lines, and that's about as
big as I try to let them get), complete, and clear about what is
happening and how that differs from what you wanted to happen.
Also, note that the above example stops parsing after the first </TD>;
if you are going to parse text containing multiple TD elements, you'll
want to read the HTML::Parser documentation to find out better ways of
doing that.
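As a starting point for the multiple-TD case, here is a rough sketch
rather than a definitive answer. It assumes you want the decoded text
of every cell in document order and that the tables are not nested.
The helper name td_texts and the second cell (456) in the sample data
are made up for illustration; the start_h, text_h and end_h
constructor options are the ones described in the HTML::Parser
documentation.

```perl
#!/usr/bin/perl
use warnings;
use strict;
use HTML::Parser ();

# Return the decoded text of every TD element in $html, in order.
# td_texts is a hypothetical helper, not part of HTML::Parser.
sub td_texts
{
    my ($html) = @_;
    my @cells;        # one string per TD seen so far
    my $in_td = 0;    # true while we are between <TD> and </TD>

    my $p = HTML::Parser->new(
        api_version => 3,
        start_h => [ sub {
            my ($tag) = @_;
            return unless $tag eq "td";
            $in_td = 1;
            push @cells, "";              # open a new, empty cell
        }, "tagname" ],
        text_h => [ sub {
            my ($text) = @_;
            # Text can arrive in several chunks, so append.
            $cells[-1] .= $text if $in_td;
        }, "dtext" ],
        end_h => [ sub {
            my ($tag) = @_;
            $in_td = 0 if $tag eq "td";   # keep parsing, unlike eof
        }, "tagname" ],
    );

    $p->parse($html);
    $p->eof;          # flush any text the parser is still holding
    return @cells;
}

my @cells = td_texts(<<EODATA);
<TR><TD class=tblform3 id=L_listing width=23>123</TD>
<TD class=tblform3 width=23>456</TD></TR>
EODATA
print "$_\n" for @cells;
```

Run against that sample data, it should print 123 and 456 on separate
lines. The difference from the earlier example is that the end handler
only clears a flag instead of calling eof, so the parser carries on to
the next cell. Nested tables would need a depth counter instead of a
simple flag.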
-=Eric
--
Come to think of it, there are already a million monkeys on a million
typewriters, and Usenet is NOTHING like Shakespeare.
-- Blair Houghton.