Parsing HTML Tables
- From: srikpen@xxxxxxxxxxx (Sri Pen)
- Date: Sat, 30 Apr 2005 13:17:47 +0530
my( @tableRows, @tableDefRows);
$htmlContent =~ s/\n//g; # $htmlContent contains the sample below
<table><tbody> <tr align="center"><td align="center">4321191</td>
<td align="center"><a target="_blank"
href="http://mail.yahoo.com/config/login?/pls/bug/webbug_util.show_bug_user?
p_userid=YHIRA">YHIRA</a></td>
<td align="center">Bug 432119 - WHEN CREATING DATABASE</td>
<td align="center">11 - Code Bug (Response/Resolution)</td>
<td align="center"><a target="_blank"
href="http://mail.yahoo.com/config/login?/pls/bug/webbug_util.show_bug_user?
p_userid=SIHL">SIHL</a> </td>
<td align="center">Medium</td></tr><tr>.........</tr></tbody></table>
@tableRows = ($htmlContent =~ /(<tr.*?>*.?>YHIRA<.*?<\/tr>)?/isg );
for ( @tableRows ) {
    @tableDefRows = ( $_ =~ /(<td.*?>*.?4321191*.?<\/td)?/isg );
}
Referring to the code above: why doesn't $tableRows[0] match only the data
between <tr*>..>YHIRA<..</tr>? Instead, it matches all the data between
<TABLE><TBODY>...</TBODY>.
Likewise, the second match, @tableDefRows, should hold the data between
<TD*>..4321191..</TD>, but it matches all the data between
<TABLE><TBODY><TR*>...</TR></TBODY></TABLE>.
Something is wrong here. Do I need to somehow start from <TABLE> and match
all the way to the first </td>, using backtracking or something similar?
Thanks much.
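
One likely explanation for the behaviour described above: a non-greedy
quantifier such as .*? only makes a match end as early as possible. The regex
engine still begins at the leftmost position where a match can start, so
/<tr.*?>YHIRA<.*?<\/tr>/ starts at the first <tr in the document and
stretches across any earlier rows until it reaches >YHIRA<. The sketch below
illustrates this on made-up data and shows one possible workaround: split the
table into rows first, then filter the rows, then extract the cells. The
sample HTML and the variable names ($html, @rows, @wanted, @cells, $spanning)
are assumptions for illustration, not part of the original code, and the
approach assumes there are no nested tables.

```perl
use strict;
use warnings;

# Hypothetical sample: two rows, only the second one mentions YHIRA.
my $html = join '',
    '<table><tbody>',
    '<tr><td align="center">1111111</td><td><a href="#">ABCD</a></td></tr>',
    '<tr align="center"><td align="center">4321191</td>',
    '<td align="center"><a href="#">YHIRA</a></td>',
    '<td align="center">Medium</td></tr>',
    '</tbody></table>';

# The problem: the match starts at the FIRST <tr and the lazy .*? simply
# grows until >YHIRA< is found, so the capture spans both rows.
my ($spanning) = $html =~ m{(<tr.*?>YHIRA<.*?</tr>)}is;

# Step 1: split into rows. Each match starts at a <tr and stops at the
# nearest </tr>, so no single match can cover two rows.
my @rows = $html =~ m{(<tr\b[^>]*>.*?</tr>)}isg;

# Step 2: keep only the rows that mention the user.
my @wanted = grep { />YHIRA</i } @rows;

# Step 3: pull the cell contents out of each matching row.
my @cells;
for my $row (@wanted) {
    push @cells, [ $row =~ m{<td\b[^>]*>(.*?)</td>}isg ];
}

print scalar(@rows), " rows, ", scalar(@wanted), " matching\n";
print "first cell: $cells[0][0]\n";
```

Note also that wrapping the whole pattern in an optional group, (...)?, lets
it match the empty string at every position, so a list-context /g match can
return many undefined entries. For anything beyond simple, regular markup, a
real HTML parser from CPAN (for example HTML::TableExtract) is probably a
safer choice than regular expressions.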