Re: LWP get but only get the first 25% then exit/stop_retrieveing



Al wrote:

> Purl Gurl wrote:
> > Al wrote:
> > > Purl Gurl wrote:
> > > > Purl Gurl wrote:
> > > > > Purl Gurl wrote:
> > > > > > Al wrote:

(snipped)

> And both lynx and wget are failing me whilst Perl comes through.

> The page to be scraped appears to have some variables that fail to
> interpolate/dump_content when I use lynx or wget. But I do get the
> full web page content extrapolated when I use LWP::Simple

> getprint("http://www.wrh.noaa.gov/total_forecast/index.php?wfo=sto&zone=caz017&fire=caz217&county=cac067");

For weather reports, you will discover the site linked below to be reliable and easy to scrape.
Rather than scrape it, I simply use their free linkable weather report "gizmo."

http://www.wunderground.com/geo/BannerPromo/US/CA/Sacramento.html

http://banners.wunderground.com/weathersticker/big2_cond/language/www/US/CA/Sacramento.gif

Take a look at those two links, then cruise the rest of the site. You might like those
gizmo gadgets and elect to simply use their pre-written HTML code in a page of yours.
No need for scraping, no need for a script; a weather report automatically appears
within a very attractive graphic which updates every five minutes, automagically.

Purl Gurl