What kind of tcl tools would help me parse and use html info?



I need to write a tool that does the following:

1. Fetch an HTML page from an HTTP URL.
2. Parse the HTML.
3. Look through the A tags for some specific phrases.
4. For each match, check a file cache. If the URL associated with the
   tag is in the cache, see whether it has been modified since it was
   placed into the cache. If it has not, move on to the next match.
5. If it has been modified, or if it is not in the cache, fetch the
   URL, store it in the cache, and touch the cached copy so that it
   carries the date and time reported by the web site.
6. For one of the specific phrases, instead of caching the file, treat
   it as the next HTML page to parse and search.
7. When one specific term is no longer found, the application is
   finished.
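
To make the cache steps concrete, here is a rough sketch of what I have in mind, using only the http package that ships with Tcl. The proc names (cachePath, httpDateToSeconds, needsFetch, storeInCache, refresh) and the file naming scheme are made up for illustration, and it assumes the server sends a Last-Modified header in the usual "Sun, 06 Nov 1994 08:49:37 GMT" form.

```tcl
package require Tcl 8.5
package require http

# Map a URL to a file name inside the cache directory.
# (Hypothetical naming scheme: unsafe characters become underscores.)
proc cachePath {cacheDir url} {
    regsub -all {[^A-Za-z0-9._-]} $url _ name
    return [file join $cacheDir $name]
}

# Convert an HTTP Last-Modified value to epoch seconds.
proc httpDateToSeconds {value} {
    return [clock scan $value -format {%a, %d %b %Y %H:%M:%S GMT} -gmt 1]
}

# True if the file is missing from the cache or older than the remote copy.
proc needsFetch {cacheFile remoteSeconds} {
    if {![file exists $cacheFile]} { return 1 }
    return [expr {$remoteSeconds > [file mtime $cacheFile]}]
}

# Write the body to the cache and "touch" it with the remote timestamp.
proc storeInCache {cacheFile body remoteSeconds} {
    file mkdir [file dirname $cacheFile]
    set f [open $cacheFile w]
    fconfigure $f -translation binary
    puts -nonewline $f $body
    close $f
    file mtime $cacheFile $remoteSeconds
}

# Refresh one URL; returns 1 if it was downloaded, 0 if the cache was current.
proc refresh {cacheDir url} {
    set cacheFile [cachePath $cacheDir $url]
    set tok [http::geturl $url -validate 1]   ;# HEAD request, headers only
    # Assumption: with no Last-Modified header, treat the page as just changed.
    set remote [clock seconds]
    foreach {key value} [http::meta $tok] {
        if {[string equal -nocase $key Last-Modified]} {
            set remote [httpDateToSeconds $value]
        }
    }
    http::cleanup $tok
    if {![needsFetch $cacheFile $remote]} { return 0 }
    set tok [http::geturl $url -binary 1]
    storeInCache $cacheFile [http::data $tok] $remote
    http::cleanup $tok
    return 1
}
```

Something like `refresh ./cache $url` would then be called once per matching link. Error handling (timeouts, non-200 status codes) is left out of the sketch.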

The only other complication in the algorithm above is that one of the
URLs points to a CGI script that takes query values. The other URLs
are just static HTML pages.
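
For the CGI case, I assume the query string can be built with http::formatQuery and appended to the script URL. The script URL and parameter names below are placeholders, not the real ones.

```tcl
package require http

# Build a GET URL for a CGI script from a base URL and key/value pairs.
# (cgiUrl is a hypothetical helper name.)
proc cgiUrl {base args} {
    return "$base?[http::formatQuery {*}$args]"
}

# Placeholder script and parameters, for illustration only.
set url [cgiUrl http://example.com/cgi-bin/list.cgi page 2 sort date]
```

The resulting URL would then go to http::geturl like any of the static pages.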

What are some examples of using Tcl tools to parse the fetched page
and search its A tags for phrases?
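
The kind of thing I am after looks roughly like the sketch below. It uses plain regexp and string commands rather than a real HTML parser (packages such as tcllib's htmlparse or tDOM exist for that), so it assumes reasonably regular markup with quoted href values. The proc names extractLinks and linksMatching are made up for illustration.

```tcl
package require Tcl 8.5

# Return a flat list of href/text pairs, one pair per <a ...>...</a> element.
proc extractLinks {html} {
    set links {}
    set lower [string tolower $html]   ;# for a case-insensitive search of </a>
    set pos 0
    while {[regexp -nocase -indices -start $pos -- {<a\s[^>]*>} $html tagIdx]} {
        lassign $tagIdx tagStart tagEnd
        set tag [string range $html $tagStart $tagEnd]
        set close [string first "</a>" $lower [expr {$tagEnd + 1}]]
        if {$close < 0} break
        # Link text is everything up to </a>, with nested tags stripped.
        set text [string range $html [expr {$tagEnd + 1}] [expr {$close - 1}]]
        regsub -all {<[^>]+>} $text {} text
        if {[regexp -nocase -- {href\s*=\s*["']([^"']*)["']} $tag -> href]} {
            lappend links $href [string trim $text]
        }
        set pos [expr {$close + 4}]
    }
    return $links
}

# Keep only the pairs whose link text contains one of the phrases
# (case-insensitive substring match).
proc linksMatching {links phrases} {
    set hits {}
    foreach {href text} $links {
        foreach phrase $phrases {
            if {[string first [string tolower $phrase] [string tolower $text]] >= 0} {
                lappend hits $href $text
                break
            }
        }
    }
    return $hits
}
```

With the page body from http::data in a variable, `linksMatching [extractLinks $body] {"Download" "Next"}` would give the URLs to cache or follow. Relative href values would still need to be resolved against the page URL.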
