Re: Downloading lots and lots and lots of files
- From: Ted Zlatanov <tzz@xxxxxxxxxxxx>
- Date: Mon, 29 Jan 2007 15:33:25 -0500
On 29 Jan 2007, coolneo@xxxxxxxxx wrote:
Google is kinda odd sometimes. It took them forever to allow multiple
download streams, and then they provide this web interface to recall
data in text format with wget. I mean, for Google, you figure they
could do better. I think they would prefer to not give us anything at
all. Once we have it there is always the chance we'll give it away or
lose it or have it stolen (by Microsoft!).
As a business decision it may make sense; technically it's nonsense :)
At the very least they should give you an rsync interface. It's a
single TCP stream, it's fast, and it can be resumed if the connection
should abort. HTTP is low on my list of transport mechanisms for
large files.
Another thing I didn't mention is that this can grow to much larger
than the 50,000, in which case I'd much rather just auto-download
than deal with media.
Sure. I was talking about your initial data load; subsequent loads
can be incremental.
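One way an incremental load could be sketched (in Python here, as an illustration only): before requesting anything, work out which files have no usable local copy. This assumes the list of remote file names is already known and that any non-empty local file counts as downloaded, so it will not notice a truncated file. The function name `files_still_needed` and its parameters are hypothetical.

```python
import os


def files_still_needed(names, dest_dir):
    """Return the names that have no non-empty local copy in dest_dir."""
    if not os.path.isdir(dest_dir):
        # A missing destination directory would make every file look absent,
        # so refuse to continue rather than re-request everything.
        raise RuntimeError(f"destination {dest_dir!r} is not available")
    needed = []
    for name in names:
        path = os.path.join(dest_dir, name)
        # Assumption: a non-empty file means the download already succeeded.
        if not (os.path.isfile(path) and os.path.getsize(path) > 0):
            needed.append(name)
    return needed
```

Only the names this returns would then be handed to the downloader.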
I would also suggest limiting to N downloads per hour, to avoid bugs
or other situations (unmounted disk, for example) where you're
repeatedly requesting all the data you already have. That's a very
nasty situation.
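A minimal sketch of such a cap, written in Python and assuming a rolling one-hour window. `HourlyLimiter` and `try_acquire` are made-up names for illustration, not part of any library, and the injectable `clock` argument exists only so the logic can be exercised without waiting.

```python
import time
from collections import deque


class HourlyLimiter:
    """Allow at most max_per_hour downloads in any rolling window."""

    def __init__(self, max_per_hour, window_seconds=3600.0, clock=time.monotonic):
        self.max_per_hour = max_per_hour
        self.window_seconds = window_seconds
        self.clock = clock        # hypothetical hook: swap in a fake clock for tests
        self.started = deque()    # start times of downloads inside the window

    def try_acquire(self):
        """Return True and record a download if the budget allows one."""
        now = self.clock()
        # Forget downloads that have aged out of the window.
        while self.started and now - self.started[0] >= self.window_seconds:
            self.started.popleft()
        if len(self.started) >= self.max_per_hour:
            return False
        self.started.append(now)
        return True
```

A download loop would call `try_acquire()` before each request and stop (or sleep) when it returns False, so a bug that keeps re-requesting everything is limited to N requests per hour.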
Ted