RE: extract web pages from a web site




I recommend Lincoln Stein's book "Network Programming with Perl".

Even if you are too cheap to buy his book, you can Google for it and
download the source code for an example program that uses HTML::Parser to
extract and download all the GIF files from a page. His example actually
parses the HTML, and it sounds like you are not interested in that part.
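
For what it's worth, here is a minimal sketch of that idea. It is not the
code from the book, just an illustration of using HTML::Parser to collect
the GIF image URLs on a page. The sub name gif_urls, the sample HTML, and
the example.com base URL are all made up for the example.

```perl
use strict;
use warnings;
use HTML::Parser ();
use URI ();

# Return the absolute URLs of all <img> tags whose src points at a .gif.
# $html is the page source, $base is the URL the page was fetched from.
sub gif_urls {
    my ($html, $base) = @_;
    my @urls;

    my $parser = HTML::Parser->new(
        api_version => 3,
        # Called for every start tag; tag and attribute names arrive lowercased.
        start_h => [
            sub {
                my ($tag, $attr) = @_;
                return unless $tag eq 'img' && defined $attr->{src};
                # Resolve relative links against the page's own URL.
                my $abs = URI->new_abs($attr->{src}, $base)->as_string;
                push @urls, $abs if $abs =~ /\.gif(?:[?#]|\z)/i;
            },
            'tagname, attr',
        ],
    );
    $parser->parse($html);
    $parser->eof;

    return @urls;
}

# Hypothetical sample input, only for illustration.
my $sample = '<p><img src="logo.gif"> <img src="/pics/photo.png"></p>';
print "$_\n" for gif_urls($sample, 'http://www.example.com/dir/');
```

Each URL returned could then be handed to getstore() from LWP::Simple to
save the image locally.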

I looked at WWW::Mechanize and was dismayed because it looked like it was
extremely specific. It only had a few functions and was not general purpose.

Siegfried


-----Original Message-----
From: Scott R. Godin [mailto:nospam@xxxxxxxxxxxxx]
Sent: Wednesday, September 14, 2005 9:33 PM
To: beginners@xxxxxxxx; jose.pinto@xxxxxxxxxx
Subject: Re: extract web pages from a web site

José Pedro Silva Pinto wrote:
> Hi there,
>
> I am doing a program in perl to extract some web pages (And copy it to a
> local file), from a given web address.
>
> Which perl module can I use to help me to do this task

It depends on what you're looking to do...

LWP::Simple to grab stuff with, WWW::Mechanize and HTML::TokeParser or
HTML::Parser.. to interact with it and pick apart the results..
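
As a rough sketch of the "grab stuff" half, assuming all that is needed is
to copy one page to a local file: getstore() from LWP::Simple fetches a URL
and writes the body to disk. The sub name save_page, the URL, and the
filename below are placeholders, not anything from the original question.

```perl
use strict;
use warnings;
use LWP::Simple qw(getstore is_success);

# Fetch $url and write the response body to $file.
# Returns true if the status code reported by getstore() indicates success.
sub save_page {
    my ($url, $file) = @_;
    my $status = getstore($url, $file);
    return is_success($status);
}

# Example call (hypothetical URL and filename):
# save_page('http://www.example.com/', 'example.html')
#     or warn "download failed\n";
```

This only saves the HTML itself; images, stylesheets, and scripts the page
refers to would need to be found and fetched separately.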

If you simply want to download and store the webpage, wouldn't you also want
to store the attendant image/css/javascript/embedded files that it
references externally?



