Re: LWP module - parse one line at a time (only download part of a page)



"nobull@xxxxxxxx" <nobull67@xxxxxxxxx> wrote:
> Alf McLaughlin wrote:
>
> > I want to download a fairly large amount of data from a webpage
> > (~10MB), but the stuff I'm really interested in is always toward the
> > top of the page (however, I don't know exactly where). Since I'm only
> > interested in two or three lines, I don't want to download the whole
> > page. I would like to download until I see what I want (such as my
> > $current_line =~ /WHAT I WANT/) and then kill the download.
>
> Read the description of the get() method of LWP::UserAgent.

I think you mean request() rather than get().

> In particular note the existence of the callback and the bit where it
> says "The callback can abort the request by invoking die()."

This method is the direct answer to the OP's question, but he will have to
be careful to account for the chance that his desired string will span a
chunk boundary.
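
A minimal sketch of the callback approach with the chunk-boundary problem
handled by carrying over the tail of each chunk. The helper names
(make_scanner, fetch_until_match), the 4096-byte read-size hint, and the
assumption that a match is never longer than $max_len characters are mine,
not something from the LWP docs:

```perl
use strict;
use warnings;

# Returns a closure that scans successive chunks for $pattern.
# ASSUMPTION: no match is longer than $max_len characters, so carrying
# over the last $max_len - 1 characters of what we have seen is enough
# to catch a match that is split across two chunks.
sub make_scanner {
    my ($pattern, $max_len) = @_;
    my $tail = '';
    return sub {
        my ($chunk) = @_;
        my $buf = $tail . $chunk;
        return $1 if $buf =~ /($pattern)/;
        my $keep = $max_len - 1;
        $tail = $keep <= 0          ? ''
              : length($buf) > $keep ? substr($buf, -$keep)
              :                        $buf;
        return;
    };
}

# Hypothetical wrapper: download $url until $pattern is seen, then abort.
sub fetch_until_match {
    my ($url, $pattern, $max_len) = @_;
    require LWP::UserAgent;
    require HTTP::Request;
    my $scan = make_scanner($pattern, $max_len);
    my $found;
    my $ua = LWP::UserAgent->new;
    $ua->request(
        HTTP::Request->new(GET => $url),
        sub {
            my ($chunk) = @_;
            $found = $scan->($chunk);
            die "found\n" if defined $found;   # die() aborts the request
        },
        4096,                                  # read-size hint, in bytes
    );
    return $found;
}

# Only touches the network when a URL is given on the command line.
if (@ARGV) {
    my $hit = fetch_until_match($ARGV[0], qr/WHAT I WANT/, 11);
    print defined $hit ? "$hit\n" : "not found\n";
}
```

One caveat: an open-ended pattern such as /WHAT I WANT.*/ can match a line
that is cut off at the end of a chunk, so it is probably safer to match only
a bounded string and pick up the rest of the line afterwards.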

I think a simpler but less rigorous option would be to set
$ua->max_size to his best guess of an upper limit on how far into the
response the desired string can be. But there is always the danger that
the upper limit turns out to be set too low, and you miss things that the
callback method would find. Of course, there is the corresponding hazard
that the guess will be set too high, and he will still be reading far more
data than necessary.
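
Roughly what the max_size variant might look like. The helper names
(fetch_head_of_page, first_matching_line) and the 64 KB limit are
placeholders of mine; the limit is exactly the guess described above. When
LWP cuts a response short because of max_size, it adds a Client-Aborted
header to the response, which at least lets you tell "not in the page" from
"maybe past my limit":

```perl
use strict;
use warnings;

# Hypothetical helper: fetch at most about $limit bytes of $url.
# Returns the content plus a flag saying whether LWP truncated it.
sub fetch_head_of_page {
    my ($url, $limit) = @_;
    require LWP::UserAgent;
    my $ua = LWP::UserAgent->new;
    $ua->max_size($limit);            # give up once this much has arrived
    my $res = $ua->get($url);
    die $res->status_line, "\n" unless $res->is_success;
    my $aborted   = $res->header('Client-Aborted');
    my $truncated = defined $aborted && $aborted eq 'max_size';
    return ($res->content, $truncated);
}

# Return the first line of $text that matches $pattern, or nothing.
sub first_matching_line {
    my ($text, $pattern) = @_;
    for my $line (split /\n/, $text) {
        return $line if $line =~ $pattern;
    }
    return;
}

# Only touches the network when a URL is given on the command line.
if (@ARGV) {
    my ($text, $truncated) = fetch_head_of_page($ARGV[0], 64 * 1024);
    my $line = first_matching_line($text, qr/WHAT I WANT/);
    if    (defined $line) { print "$line\n" }
    elsif ($truncated)    { warn "not found; the limit may be too low\n" }
    else                  { warn "not found in the whole page\n" }
}
```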

Xho
