Testing for changes on a web page (was: how to find difference in number of characters)
- From: Stefan Behnel <stefan_ml@xxxxxxxxx>
- Date: Sat, 09 Oct 2010 14:41:27 +0200
harryos, 09.10.2010 14:24:
I am trying to determine if a web page is updated by x number of
characters. The Mozilla Firefox plugin 'Update Scanner' has similar
functionality: a user can specify the x. I think this would be done
by reading from the same URL at two different times and finding the
change in body text.
"Number of characters" sounds like a rather useless measure here. I'd rather apply an XPath, CSS selector or PyQuery expression to the parsed page and check if the interesting subtree of it has changed at all or not, potentially disregarding any structural changes by stripping all tags and normalising the resulting text to ignore whitespace and case differences.
Stefan
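A minimal sketch of the approach described above, not code from the original post. To stay self-contained it uses the standard library's html.parser and selects the subtree by element id, as a stand-in for a real XPath, CSS selector or PyQuery expression. The names SubtreeText, fingerprint and has_changed and the id "news" are hypothetical, and the depth tracking assumes well-nested markup with explicit closing tags.

```python
import hashlib
from html.parser import HTMLParser

# Elements that never have a closing tag; they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class SubtreeText(HTMLParser):
    """Collect the text inside the first element with a given id.

    Assumption: the markup is well nested (no omitted closing tags).
    """

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0      # > 0 while we are inside the target subtree
        self.done = False   # True once the target element has been closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.done or tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth and tag not in VOID_TAGS:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def fingerprint(html, target_id):
    """Hash the subtree's text with tags stripped, whitespace and case normalised."""
    parser = SubtreeText(target_id)
    parser.feed(html)
    parser.close()
    text = " ".join("".join(parser.chunks).split()).casefold()
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def has_changed(old_html, new_html, target_id):
    """Return True if the interesting subtree's normalised text differs."""
    return fingerprint(old_html, target_id) != fingerprint(new_html, target_id)
```

In use, one would store the fingerprint from the first fetch and compare it with the fingerprint of a later fetch of the same URL. Changes outside the selected subtree, or changes that only affect tags, whitespace or letter case inside it, do not count as an update.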