Re: web crawling for books
- From: Spiros Denaxas <s.denaxas@xxxxxxxxx>
- Date: Sun, 25 Nov 2007 05:59:41 -0800 (PST)
On Nov 25, 9:58 am, "alexxx.ma...@xxxxxxxxx" <alexxx.ma...@xxxxxxxxx>
wrote:
I have a large list of my library's books,
and I would like to set up a Perl spider that searches the web for each
author/title pair and returns useful information I didn't put into
the records (editor, year, topic, ISBN, ...).
I have already written the spider's basic structure, but I'm not sure
which site is best suited to such a search (keeping in mind that its
robots.txt should allow me access).
Which site would you suggest for such a task?
Thank you!
Alessandro Magni
Hi,
Speaking from experience, I think you will get higher-quality, more
relevant results by using APIs instead of scraping sites.
For example, check out the Amazon Web Services API at
http://www.amazon.com/AWS-home-page-Money/b?ie=UTF8&node=3435361
You could also potentially use http://books.google.com/.
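As a rough illustration of the API approach (in Python rather than Perl, only to show the shape), the sketch below builds a search URL for one author/title pair and pulls publisher, year, topic and ISBN out of a JSON reply. The endpoint and field names assume the public Google Books "volumes" search API, which is not named in this thread; build_query_url, extract_record and lookup are hypothetical helper names.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the Google Books "volumes" search endpoint (not given in this thread).
API_URL = "https://www.googleapis.com/books/v1/volumes"


def build_query_url(author, title):
    """Build a search URL for one author/title pair."""
    query = f"intitle:{title} inauthor:{author}"
    return API_URL + "?" + urllib.parse.urlencode({"q": query, "maxResults": 1})


def extract_record(response):
    """Pick the missing catalogue fields out of a decoded JSON response.

    Returns None when the response contains no matches.
    """
    items = response.get("items") or []
    if not items:
        return None
    info = items[0].get("volumeInfo", {})
    # Keep only ISBN identifiers (assumed type names start with "ISBN").
    isbns = [
        ident["identifier"]
        for ident in info.get("industryIdentifiers", [])
        if ident.get("type", "").startswith("ISBN")
    ]
    return {
        "title": info.get("title"),
        "authors": info.get("authors", []),
        "publisher": info.get("publisher"),
        # publishedDate is assumed to start with a four-digit year.
        "year": (info.get("publishedDate") or "")[:4] or None,
        "topics": info.get("categories", []),
        "isbns": isbns,
    }


def lookup(author, title):
    """Fetch and parse the first match (performs a network request)."""
    with urllib.request.urlopen(build_query_url(author, title), timeout=10) as resp:
        return extract_record(json.load(resp))
```

Calling lookup(author, title) once per catalogue entry, with a pause between requests, would fill in the missing fields. Only the first match is taken, so ambiguous titles still need a manual check.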
Spiros