Re: What kind of tcl tools would help me parse and use html info?
- From: "Gerald W. Lester" <Gerald.Lester@xxxxxxx>
- Date: Fri, 24 Mar 2006 12:19:07 -0600
Larry W. Virden wrote:
I know others have replied, but...
I have a need to write a tool to do this:
fetch an html http URL
Use the http package.
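A minimal sketch of the fetch step, assuming a plain http:// URL and the http package that ships with Tcl. The proc names fetchPage and headerValue and the 30 second timeout are made up for illustration; only the ::http:: calls come from the package.

```tcl
package require http

# Hypothetical helper: case-insensitive lookup in the flat
# {name value name value ...} list returned by ::http::meta.
proc headerValue {meta name} {
    foreach {key value} $meta {
        if {[string equal -nocase $key $name]} {
            return $value
        }
    }
    return ""
}

# Hypothetical helper: fetch a URL and return a two-element list
# {body lastModified}. The timeout value is an arbitrary choice.
proc fetchPage {url} {
    set token  [::http::geturl $url -timeout 30000]
    set status [::http::status $token]
    set code   [::http::ncode $token]
    set body   [::http::data $token]
    set stamp  [headerValue [::http::meta $token] Last-Modified]
    ::http::cleanup $token
    if {$status ne "ok" || $code != 200} {
        error "fetch of $url failed: status=$status code=$code"
    }
    return [list $body $stamp]
}
```

The Last-Modified value is returned along with the body because the cache steps further down need it. The server may not send that header at all, in which case the second element is an empty string.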
parse the html
I'd use the htmlparse package from Tcllib with the -cmd option.
Look through the A tags for some specific phrases
The routine you pass to ::htmlparse::parse via the -cmd option is called for every tag; just check whether the tag is an A.
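Something along these lines, as a rough sketch rather than tested production code. It relies on the documented behaviour that the -cmd callback receives four arguments (tag name, a slash for closing tags, the raw attribute string, and the text up to the next tag). The proc names collectLink, collectLinks and linksMatching and the global ::linkList are invented for this example, and the regexp only handles quoted href values.

```tcl
package require htmlparse

# Callback for ::htmlparse::parse. It is also called for the parser's
# virtual root tag and for every non-A tag; those are skipped here.
proc collectLink {tag slash param text} {
    if {$slash ne "" || ![string equal -nocase $tag "a"]} {
        return
    }
    # Assumption: href values are quoted; unquoted ones are ignored.
    if {[regexp -nocase {href\s*=\s*["']([^"']*)["']} $param -> href]} {
        lappend ::linkList $href [string trim $text]
    }
}

# Return a flat list {href text href text ...} for all A tags.
proc collectLinks {html} {
    set ::linkList {}
    ::htmlparse::parse -cmd ::collectLink $html
    return $::linkList
}

# Keep only the links whose text contains $phrase (case-insensitive).
proc linksMatching {html phrase} {
    set hits {}
    set needle [string tolower $phrase]
    foreach {href text} [collectLinks $html] {
        if {[string first $needle [string tolower $text]] >= 0} {
            lappend hits $href $text
        }
    }
    return $hits
}
```

For example, [linksMatching $html "next"] gives back the href and link text of every A tag whose text mentions "next". The href values come back exactly as written in the page, so relative links still have to be resolved against the page URL.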
For each one found, check a file cache. If the URL associated with the
tag is in the cache, see if it has been modified since it was placed
into the cache.
Use file mtime, clock scan and string equal.
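A small sketch of that check, assuming the server sends Last-Modified in the usual "Fri, 24 Mar 2006 12:19:07 GMT" form and that you run Tcl 8.5 or later (for clock scan -format). The proc name needsRefresh is made up, and the timestamps are compared as integers with != instead of string equal.

```tcl
# Hypothetical helper: return 1 if the cache copy is missing or its
# mtime differs from the server's Last-Modified value, else 0.
proc needsRefresh {cacheFile lastModified} {
    if {![file exists $cacheFile]} {
        return 1
    }
    # Assumption: the header is an RFC 1123 date ending in "GMT".
    set webTime [clock scan $lastModified \
        -format {%a, %d %b %Y %H:%M:%S GMT} -gmt 1]
    return [expr {[file mtime $cacheFile] != $webTime}]
}
```

This only works if the cache copy was given the web site's timestamp when it was stored; otherwise the two times never match and everything gets fetched again.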
If not, continue.
If it has been modified, or if it doesn't exist in the cache, then
fetch the URL,
Again, use the http package.
place into the cache, and touch to make the cache copy
have the date and time from the web site.
file mtime $filename $webDateTime
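Spelled out a little more, as a sketch: write the body to the cache file, then stamp it. The proc name storeInCache and its arguments are invented here, and the date format is an assumption about what the server sends (Tcl 8.5 or later for clock scan -format).

```tcl
# Hypothetical helper: save $body as $cacheFile and give the file the
# server's modification time. Returns that time in epoch seconds.
proc storeInCache {cacheFile body lastModified} {
    set out [open $cacheFile w]
    # Binary translation so the bytes are written without newline changes.
    fconfigure $out -translation binary
    puts -nonewline $out $body
    close $out
    # Assumption: the header is an RFC 1123 date ending in "GMT".
    set webDateTime [clock scan $lastModified \
        -format {%a, %d %b %Y %H:%M:%S GMT} -gmt 1]
    file mtime $cacheFile $webDateTime
    return $webDateTime
}
```

Stamping the file with the server's time, not the local download time, is what makes a later comparison against Last-Modified meaningful.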
For one of the specific phrases, instead of caching the file, treat it
as the next html to parse and search.
Put the above in a proc and recursively call it.
When one specific term is no longer found, application is finished.
The stack unwinds and you exit.
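The recursion could look roughly like this. Everything here is a hypothetical skeleton: walkPages is an invented name, getLinks stands for whatever command fetches and parses one page and returns a flat {href text ...} list of absolute URLs, and onLink stands for the cache handling of an ordinary link. It uses {*}, so Tcl 8.5 or later is assumed.

```tcl
# url          page to process
# followPhrase link text that marks "the next html to parse"
# getLinks     command prefix: called with a URL, returns {href text ...}
# onLink       command prefix: called with href and text for other links
proc walkPages {url followPhrase getLinks onLink} {
    set needle [string tolower $followPhrase]
    foreach {href text} [{*}$getLinks $url] {
        if {[string first $needle [string tolower $text]] >= 0} {
            # This link is the next page: recurse instead of caching it.
            walkPages $href $followPhrase $getLinks $onLink
        } else {
            {*}$onLink $href $text
        }
    }
    # No more links on this page: return, which unwinds the stack.
}
```

Note that this sketch has no guard against pages that link back to each other; if that can happen on your site, keep a list of visited URLs and skip repeats.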
The only other possible thing for the algorithm above is that one of
the URLs is the URL of a CGI with values. The other URLs are just
static HTML pages.
What are some examples using some of the Tcl tools for parsing that
fetched file and searching the A tags for phrases?
Take a look at htmlparse.test in the Tcllib sources on tcllib.sf.net.
--
+--------------------------------+---------------------------------------+
| Gerald W. Lester |
|"The man who fights for his ideals is the man who is alive." - Cervantes|
+------------------------------------------------------------------------+