html parsing



I would like to retrieve a page from one of my favorite sites. The section I am interested in starts with an HTML heading (in this case an h2), followed by a table, with all the usual HTML formatting mixed in (fonts, spans, etc.). Is there an easy way to pull out just the h2 heading text and convert the table so that each row becomes a Tcl list?

I am experimenting with tDOM, but it is hard to see what I should look for and what I should ignore.
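
One possible approach with tDOM, offered as a minimal sketch rather than a definitive answer: parse the page with the HTML parser, locate the heading and the table with XPath, and use asText to read each cell so that the font and span markup is ignored. The sample HTML, the tidy helper, and the assumption that the wanted section is the first h2 followed by the first table after it are all made up for illustration; the real page will probably need a more specific XPath expression.

```tcl
package require tdom

# Hypothetical sample standing in for the downloaded page.
set html {<html><body>
<h2><font color="red">Today's <span>Results</span></font></h2>
<table>
<tr><th>Name</th><th>Score</th></tr>
<tr><td><font size="2">Alice</font></td><td><span>42</span></td></tr>
<tr><td>Bob</td><td><b>17</b></td></tr>
</table>
</body></html>}

# Hypothetical helper: collapse runs of whitespace and trim both ends.
proc tidy {text} {
    return [string trim [regsub -all {\s+} $text { }]]
}

# -html selects tDOM's forgiving HTML parser instead of the strict XML one.
set doc  [dom parse -html $html]
set root [$doc documentElement]

# Assumption: the section of interest starts at the first h2 in the page.
set h2 [lindex [$root selectNodes {(//h2)[1]}] 0]

# asText returns the text of all descendant text nodes, so nested
# font/span/b elements simply disappear from the result.
set heading [tidy [$h2 asText]]

# Assumption: the wanted table is the first one after that heading.
set table [lindex [$h2 selectNodes {following::table[1]}] 0]

# Build one Tcl list per table row, one element per th/td cell.
set rows {}
foreach tr [$table selectNodes {.//tr}] {
    set row {}
    foreach cell [$tr selectNodes {th|td}] {
        lappend row [tidy [$cell asText]]
    }
    lappend rows $row
}

# The extracted values are plain strings, so the DOM tree can be freed.
$doc delete

puts $heading
foreach row $rows {
    puts $row
}
```

For a live page, the html variable would instead be filled from the site, for example with the http package (http::geturl followed by http::data). Everything that is not the heading or a table cell can be ignored, because the XPath expressions pick out only those nodes.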

