Re: Pulling out data between <TD> tags using regular expressions



tdmailbox@xxxxxxxxx wrote:
> <TD class=tblform3 id=L_listnum.*?>(.*?)<\/TD>
>
> That works, however it returns the whole <TD> tag. I just want the
> value inside the tag. That is my core issue that I can't find the
> solution to. I can find plenty of expressions that will find the right
> <TD> tag, but not one that will just give me the data between the tags.

Read up on HTML::TableExtract.

Getting this sort of data with a regex or similar is tricky, and the page layout may change (will change), at which point the pattern silently stops matching.
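
For what it's worth, the pattern in the question already captures the cell contents: the parentheses form a capture group, so the value ends up in $1 rather than in the whole match. A minimal sketch, assuming a made-up cell (the id suffix, the nowrap attribute and the value are invented for illustration), and just as fragile as any other regex approach to HTML:

```perl
use strict;
use warnings;

# Hypothetical sample cell; the id suffix and the value are made up.
my $html = '<TD class=tblform3 id=L_listnum_1 nowrap>42</TD>';

# Same pattern as in the question. In list context the match returns
# the captured groups, so $value gets the text between the tags
# (what $1 would hold), not the whole <TD>...</TD> match.
my ($value) = $html =~ m{<TD class=tblform3 id=L_listnum.*?>(.*?)</TD>};

print defined $value ? "$value\n" : "no match\n";
```

With the /g modifier in list context, the same pattern returns the captured text of every matching cell instead of just the first.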

If the tables are not well structured, you may have to select by depth and count (nesting level, and position among the tables at that level) to get the right table. You will have to get to grips with the structure of the data you are dealing with: the tables and the form.
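
Selecting a table by depth and count could look roughly like the sketch below. The HTML is a hypothetical page with one nested table, and the depth/count values are assumptions that fit only this sample; for a real page you have to work them out from the source. The `tables` method is the name used in the 2.x releases of the module; as far as I recall, older 1.x releases call it `table_states`.

```perl
use strict;
use warnings;
use HTML::TableExtract;

# Hypothetical page: one outer table (depth 0) holding a nested table (depth 1).
my $html = <<'HTML';
<table><tr><td>
  <table>
    <tr><td>Item</td><td>Qty</td></tr>
    <tr><td>Widget</td><td>3</td></tr>
  </table>
</td></tr></table>
HTML

# depth = nesting level, count = position among the tables at that depth.
# Both start at 0. The values here are assumptions for this sample only.
my $te = HTML::TableExtract->new( depth => 1, count => 0 );
$te->parse($html);

for my $table ( $te->tables ) {
    for my $row ( $table->rows ) {
        # Each row is an array ref of cell text with the tags stripped.
        print join( ' | ', map { defined $_ ? $_ : '' } @$row ), "\n";
    }
}
```

If the target table has stable column headings, the constructor's `headers` option is usually less brittle than depth and count, since it keeps working when tables are added or removed elsewhere on the page.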

Start here: http://search.cpan.org/~msisk/HTML-TableExtract-1.08/lib/HTML/TableExtract.pm

Happy reading.