Re: Read everything from socket




Aaron Dougherty wrote:
My pleasure. Unfortunately you're running into one of the limitations
of TCP/IP. There is no standard way to indicate a stream is finished
short of closing the connection (which Perl will handle properly).

Because of this, protocols typically define their own way to signal
the end of a response.
What I would suggest in your example is to parse the first response
line from the RETR command

+OK 1234 octets

Or from the LIST command

+OK 2 messages (1801)
1 1234
2 567

In both cases, the byte length of the response stream is given.

There are two ways to read a specific number of bytes. The first is
the read function:
http://perldoc.perl.org/functions/read.html

This is probably the best method.

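if ($retr =~ /^\+OK\s(\d+)\soctets/) {
    my $total_size = $1;
    my $bytes_read = 0;
    my $response   = '';
    while ($bytes_read < $total_size) {
        # never ask for more than is left, or read will wait for it
        my $want = $total_size - $bytes_read;
        $want = 1024 if $want > 1024;
        my $got = $socket->read(my $data, $want);
        last unless $got;    # undef on error, 0 on EOF
        $response   .= $data;
        $bytes_read += $got;
    }
}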

The other way you could do it is with the same readline command, and
use length like in the previous example to count how many bytes you
have read in so far. Once you get the expected response length, you
know the stream is complete.
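
The readline variant described above could look roughly like the sketch
below. The helper name read_counted_lines is made up for illustration,
and an in-memory filehandle stands in for the socket so the snippet runs
on its own. It assumes the octet count from the +OK line covers whole
lines, line terminators included.

```perl
use strict;
use warnings;

# Read whole lines from $fh until at least $total_size bytes have arrived.
# Returns the collected text, or nothing if the stream ends too early.
sub read_counted_lines {
    my ($fh, $total_size) = @_;
    my $response   = '';
    my $bytes_read = 0;
    while ($bytes_read < $total_size) {
        my $line = readline($fh);
        return unless defined $line;    # connection closed too early
        $response   .= $line;
        $bytes_read += length $line;
    }
    return $response;
}

# Demo: an in-memory handle plays the part of the socket (assumption).
my $wire = "Subject: hi\r\n\r\nbody\r\n.\r\n";
open my $fh, '<', \$wire or die "open: $!";
my $text = read_counted_lines($fh, 21);    # 21 = bytes before the ".\r\n"
print $text;
```

Because it stops as soon as the count is reached, the terminating dot
line is left unread on the handle for the caller to consume.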

Hope that helps :)

-Aaron

aleko.petkov@xxxxxxxxx wrote:
Aaron Dougherty wrote:
The problem is that getline is a blocking call, meaning it will sit
around and hold up the program (or fork or thread) until it has a line
to get, which means $line will always have a value, and the while will
never exit.

I'm not intimately familiar with the POP3 protocol, but if I'm not
mistaken, there are various ways with it to know when you have reached
the end of a stream. For example: The first line of a list command
indicates how many lines it will return, so you can break out of the
while when you know you got the last line, and the retr command uses a
period on a line by itself to indicate the end of the message. So
something such as the following should work for you:

# Reading from RETR
while (my $line = $socket->getline){
    last if $line=~/^\.\r?\n\z/;    # lone dot (CRLF or LF) ends the message
    $msg.=$line;
}

# Reading from LIST
if ($socket->getline =~ /^\+OK\s(\d+) messages/){
    my $message_count = $1;
    for (1..$message_count){
        $msg .= $socket->getline;    # append, so earlier lines are kept
    }
}
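
As a sketch of the delimiter approach for RETR: RFC 1939 has the server
prefix an extra dot to any message line that starts with a dot, so a
client that strips that dot again can tell the lone terminator line from
message text. The helper name read_dot_terminated is hypothetical, and
this only helps when the server really does byte-stuff as the RFC
requires.

```perl
use strict;
use warnings;

# Read a dot-terminated multi-line response from $fh.
# Returns the text without the terminator, or nothing on early EOF.
sub read_dot_terminated {
    my ($fh) = @_;
    my $msg = '';
    while (defined(my $line = readline($fh))) {
        return $msg if $line =~ /^\.\r?\n\z/;    # lone dot: end of response
        $line =~ s/^\.//;                        # undo byte-stuffing
        $msg .= $line;
    }
    return;    # stream ended before the terminator arrived
}

# Demo: an in-memory handle plays the part of the socket (assumption).
my $wire = "first\r\n..\r\nlast\r\n.\r\n";
open my $fh, '<', \$wire or die "open: $!";
print read_dot_terminated($fh);
```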

aleko.petkov@xxxxxxxxx wrote:
Hi,

I want to read a bunch of lines from a socket. The problem is that the
getline method doesn't seem to detect the end of the stream; the loop
never returns. Here's the code I'm using:

my $line = '';
while( $line = $socket->getline )
{
$msg .= $line;
}

This reads the response from a POP3 server. If I loop a fixed number of
times, everything works, so the problem is detecting the EOF.

Any ideas?

Thanks,

Aleko

Thanks Aaron,

That's basically the approach taken by the module I'm trying to work
with (Mail::Pop3Client); it reads until it hits a specific delimiter.

The problem I'm running into is that it doesn't handle malformed
messages properly, e.g. messages with '\n.\n' embedded. The script
stops parsing the body prematurely, and subsequent POP commands fail or
get the remainder of the body in response.

If I could just get the whole RETR response in a string I could parse
that easily. I just don't know how to detect the end of the response
stream.

Perfect! I ended up using method #1, but I had to reduce the buffer
size to 1, because once a read asks for more bytes than the response
has left, the socket just blocks, waiting for data. This probably slows
things down, but not noticeably, so I'll live with it for now.

Also, I had to add an extra pair of getline() calls after the loop to
take care of the NL.NL, which the POP3 server appends to the end of
the message.

So, here's my addition to the Pop3Client module, that safely retrieves
raw messages, even if they are malformed:

sub RetrieveRaw
{
    my $me  = shift;
    my $num = shift;
    my $msg = '';

    $me->_checkstate('TRANSACTION', 'RETR') or return;
    $me->_sockprint( "RETR $num", $me->EOL );
    my $line = $me->_sockread();
    unless (defined $line) {
        $me->Message("Socket read failed for RETR");
        return;
    }

    # read response & figure out how much data to expect
    unless ($line =~ /^\+OK\s(\d+)\soctets/) {
        $me->Message("Bad return from RETR: $line");
        return;
    }
    my $total_bytes = $1;
    my $bytes_read  = 0;
    my $data        = '';

    # read message text
    while( $bytes_read < $total_bytes )
    {
        # TODO: use a larger buffer size (but without blocking)
        my $got = $me->Socket()->read( $data, 1 );
        unless ($got) {    # undef on error, 0 on EOF
            $me->Message("Socket read failed for RETR");
            return;
        }
        $msg        .= $data;
        $bytes_read += $got;
    }

    # now consume EOF marker (\n.\n)
    $me->Socket()->getline;
    $me->Socket()->getline;

    return $msg;
}
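
One possible way to address the TODO above (a larger buffer without
blocking) is to cap each request at the number of bytes still owed. This
is only a sketch and not part of the module itself: read_exact is a
hypothetical helper, and a socketpair stands in for the server
connection.

```perl
use strict;
use warnings;
use Socket;
use IO::Handle;

# Read exactly $total bytes from $fh in chunks of at most $chunk bytes.
# Each request is capped at what is still owed, so the final read never
# waits for data the peer is not going to send.
sub read_exact {
    my ($fh, $total, $chunk) = @_;
    my $msg = '';
    while (length($msg) < $total) {
        my $left = $total - length($msg);
        my $want = $left < $chunk ? $left : $chunk;
        my $got  = read($fh, my $data, $want);
        return unless $got;    # undef on error, 0 on early EOF
        $msg .= $data;
    }
    return $msg;
}

# Demo: the writing end stays open, as a server's would, so an uncapped
# read of 4 bytes on the last chunk would hang here.
socketpair(my $client, my $server, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";
$server->autoflush(1);
print {$server} "0123456789.\r\n";
my $body = read_exact($client, 10, 4);
print "$body\n";
```

The trailing dot line is left on the handle, so the getline calls that
consume the end marker would still be needed afterwards.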

This fetches the entire message, including headers, body, and
attachments. You can then split it using something like this:

my $divider_pos = index($headandbody, "\r\n\r\n");
my $divider_len = 4;
if ($divider_pos < 0)    # index returns -1 when nothing is found
{
    $divider_pos = index($headandbody, "\n\n");
    $divider_len = 2;
}

if ($divider_pos > 0)
{
    $head = substr($headandbody, 0, $divider_pos);
    $body = substr($headandbody, $divider_pos + $divider_len);
}
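
A shorter alternative, as a sketch: split with a limit of 2 cuts the
message at the first blank line and copes with both CRLF and bare LF
line endings. split_message is a hypothetical helper name.

```perl
use strict;
use warnings;

# Split a raw message at the first blank line. Returns (head, body);
# when there is no blank line, only the head is returned.
sub split_message {
    my ($raw) = @_;
    return split /\r?\n\r?\n/, $raw, 2;
}

my ($head, $body) =
    split_message("Subject: hi\r\nTo: a\@b\r\n\r\nline 1\r\n\r\nline 2\r\n");
print "HEAD:\n$head\n\nBODY:\n$body";
```

The limit matters: without it, blank lines inside the body would split
the message into more than two pieces.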

Hope this benefits someone :)

Thanks again, Aaron.

-Aleko
