Re: problem reading html stream SOLVED



Dave Saville <dave@xxxxxxxxxxxxxxx> wrote:
On Sun, 15 Jan 2012 15:09:39 UTC, "Peter J. Holzer"
<hjp-usenet2@xxxxxx> wrote:

On 2012-01-15 14:01, Dave Saville <dave@xxxxxxxxxxxxxxx> wrote:
On Sun, 15 Jan 2012 12:16:08 UTC, "Peter J. Holzer"
<hjp-usenet2@xxxxxx> wrote:

<snip>

Do you use HTTP to get the data or some custom protocol?

HTTP - But it would appear to be a problem with perl sockets - Someone
suggested LWP::Simple but that was no good as I needed to process the
files which are large and the server does not have much RAM. So I used
LWP::UserAgent to dump straight to a file which I can then post
process and it works fine. Odd as I would have thought that LWP* would
use sockets at the bottom layer. Ho hum.

It does. You probably made an error in writing your own HTTP
implementation.

That I am willing to believe. Perhaps you would be so kind as to point
out the error in my code?

#!/usr/local/bin/perl
use warnings;
use strict;
use Socket;
open RAW, ">RAW" or die $!;
my $iaddr = inet_aton('xmltv.radiotimes.com') or die $!;
socket(SOCK, AF_INET, SOCK_STREAM, getprotobyname('tcp')) or die $!;
my $paddr = sockaddr_in(80, $iaddr);
connect(SOCK, $paddr) or die $!;
send SOCK, "GET /xmltv/94.dat HTTP\/1.1\r\n", 0;
send SOCK, "Host: xmltv.radiotimes.com\r\n\r\n", 0;
while ( <SOCK> )
{
print RAW $_;
}
close SOCK;
close RAW;

This hangs for minutes and then completes. I have run the above on two
different operating systems and they both do exactly the same.

These 180 kB look suspiciously like the length of the file the
server sends. You're using HTTP 1.1, which allows the server to
keep the connection open after it has sent a file, waiting for
the next request unless told otherwise (a "persistent connection"
is actually the default with HTTP 1.1). So my guess is that the
server sends the complete file just fine and then waits for the
next request. But since your loop only ends when the connection
is closed by the other side, it hangs until the server gets bored
and closes the connection after a few minutes. So either use
HTTP 1.0 or send an additional HTTP header with (IIRC)
"Connection: close\r\n". See also e.g.

http://www.w3.org/Protocols/rfc2616/rfc2616-sec8.html
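
For illustration, here is a minimal sketch of the "Connection: close" variant. It is untested against the server in question. The helper names build_request and fetch_raw are made up for this example, and it uses the core module IO::Socket::INET in place of the raw Socket calls in the original script, purely for brevity.

```perl
#!/usr/local/bin/perl
use warnings;
use strict;
use IO::Socket::INET;

# Hypothetical helper: build a GET request that asks the server to
# close the connection once the response has been sent, so that a
# read-until-EOF loop ends as soon as the file is complete.
sub build_request {
    my ($host, $path) = @_;
    return join "\r\n",
        "GET $path HTTP/1.1",
        "Host: $host",
        "Connection: close",
        "", "";    # the two empty strings yield the final blank line
}

# Hypothetical helper: send the request and copy everything the
# server returns (status line, headers and body) into $outfile.
sub fetch_raw {
    my ($host, $path, $outfile) = @_;

    my $sock = IO::Socket::INET->new(
        PeerAddr => $host,
        PeerPort => 80,
        Proto    => 'tcp',
    ) or die "connect to $host failed: $@";

    open my $raw, '>', $outfile or die "cannot open $outfile: $!";
    binmode $raw;

    print {$sock} build_request($host, $path);

    # Read fixed-size blocks until the server closes the connection.
    my $buf;
    while (read($sock, $buf, 8192)) {
        print {$raw} $buf;
    }

    close $sock;
    close $raw or die "cannot close $outfile: $!";
    return;
}
```

It would be called as fetch_raw('xmltv.radiotimes.com', '/xmltv/94.dat', 'RAW'). Note that the output file still starts with the HTTP status line and headers, as in the original script, and that an HTTP 1.1 server is allowed to send the body with chunked transfer encoding. Changing the request line to "HTTP/1.0" should avoid both the persistent connection and the chunking.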

Regards, Jens
--
\ Jens Thoms Toerring ___ jt@xxxxxxxxxxx
\__________________________ http://toerring.de