Re: web crawling for books



On Nov 25, 9:58 am, "alexxx.ma...@xxxxxxxxx" <alexxx.ma...@xxxxxxxxx>
wrote:
I have a large list of my library's books,
and I would like to set up a Perl spider that searches the web for each
author/title pair and returns useful info I didn't put into
the records (editor, year, topic, ISBN, ...).
I have already written the spider's basic structure, but I'm not sure
which site is best suited to such a search (considering also that its
robots.txt should allow me access).
Which site would you suggest for such a task?

Thank you!

Alessandro Magni

Hi,

speaking from experience, I think you will get higher-quality, more
relevant results by using APIs instead of just scraping sites, since an
API returns structured data rather than HTML you have to pick apart.
For example, check out the Amazon Web Services API at
http://www.amazon.com/AWS-home-page-Money/b?ie=UTF8&node=3435361
You could also potentially use http://books.google.com/.
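
To give a rough idea of the API approach, here is a minimal sketch in Python (the same flow ports directly to Perl). Everything specific in it is an assumption rather than something from this thread: the endpoint is the Google Books volumes search URL, the field names (`volumeInfo`, `industryIdentifiers`, and so on) are what I would expect in its JSON response, and the sample record is made up purely for illustration. Check the service's own documentation and terms before relying on any of it.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint (not named in this thread): Google Books volumes search.
API_URL = "https://www.googleapis.com/books/v1/volumes"


def build_query_url(author, title):
    """Build a search URL restricted to one author/title pair."""
    query = f'intitle:"{title}" inauthor:"{author}"'
    return API_URL + "?" + urlencode({"q": query, "maxResults": 1})


def extract_record(payload):
    """Pull the missing catalogue fields out of a decoded JSON response.

    Returns None when the response contains no matches.
    """
    items = payload.get("items") or []
    if not items:
        return None
    info = items[0].get("volumeInfo", {})
    # Map identifier type (e.g. "ISBN_13") to its value.
    isbns = {
        ident.get("type"): ident.get("identifier")
        for ident in info.get("industryIdentifiers", [])
    }
    return {
        "publisher": info.get("publisher"),
        # publishedDate may be "YYYY" or "YYYY-MM-DD"; keep the year only.
        "year": (info.get("publishedDate") or "")[:4] or None,
        "topics": info.get("categories", []),
        "isbn": isbns.get("ISBN_13") or isbns.get("ISBN_10"),
    }


def lookup(author, title):
    """Query the API for one book (performs a real network request)."""
    with urlopen(build_query_url(author, title), timeout=10) as response:
        return extract_record(json.load(response))


# Made-up sample response, so the parsing can be tried without network access.
SAMPLE = json.loads("""
{
  "items": [
    {
      "volumeInfo": {
        "title": "An Example Title",
        "authors": ["Jane Doe"],
        "publisher": "Example Press",
        "publishedDate": "2001-05-17",
        "categories": ["Fiction"],
        "industryIdentifiers": [
          {"type": "ISBN_10", "identifier": "0000000000"},
          {"type": "ISBN_13", "identifier": "9780000000002"}
        ]
      }
    }
  ]
}
""")

if __name__ == "__main__":
    print(build_query_url("Jane Doe", "An Example Title"))
    print(extract_record(SAMPLE))
```

For the real run you would loop over your author/title list, call `lookup` for each pair with a polite delay between requests, and merge the returned fields into your records, leaving a record untouched when the result is `None`.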

Spiros


