Re: Lost data on socket - Can we start over politely?
From: Thomas Kratz (ThomasKratz_at_REMOVEwebCAPS.de)
Date: 03/31/04
- In reply to: Vorxion: "Re: Lost data on socket - Can we start over politely?"
Date: Wed, 31 Mar 2004 17:17:30 +0200
Vorxion wrote:
> In article <406a9005.0@juno.wiesbaden.netsurf.de>, Thomas Kratz wrote:
>
>>>In short, I think my code is simply lagging behind, and when it lags far
>>>enough, the rest of the data vanishes. I'm working on fixing that bit, and
>>>I'll also have to prioritize it so that when data is present, it stops
>>>working on processing its data internally and goes immediately back to
>>>reading from the socket. And I'm simply going to have it scarf up as much
>>>data as is present and basically parse and process when there is no
>>>actual communication going on. That should (hopefully) eliminate the
>>>problem.
>>
>>That was my guess too, but I couldn't confirm it :-). The problem could be
>>that the TCP buffers of the machine the server runs on are filled because
>>you are draining them too slowly, but the client is sending data anyway. I
>>didn't use IO::Select on the client with a can_write(), which should take
>>care of that ( provided you have control over the client code :-)
>
>
> The question then becomes whether a non-forking model is viable for
> multiple connections if it needs to do processing. I should think so.
That depends ;-) The questions are:
1. Do I need more than SOMAXCONN simultaneous connections?
2. Am I able to get the algorithm right for looping over the client
sockets without getting too busy on one of them?
Forking (and even threading) is surely easier to handle. Blocking will
bite you fatally when jumping between the connections in one process/thread.
> I
> think it's mostly the expense of doing a sysread() of -one- character
> about 18 times, and -then- getting bigger blocks, but still not big enough
> to make it keep pace--they were only as big as the packets, which were
> generally 110chars max.
>
> At which point I probably had far more overhead than it could tolerate when
> the client was spewing things forth without delay, even though it was
> writing a packet/line at a time. :)
>
> Boy, I never considered the client-side. You're basically saying that the
> can_write will actually get the socket equivalent of flow control from the
> remote end and only send when it won't get lost?
Exactly. It means that your client side send buffer has free space to
queue the data. Actually it only guarantees you can safely write 1 byte.
I just tested it with a slightly modified client:
    use strict;
    use warnings;
    use IO::Socket;
    use IO::Select;

    $| = 1;

    my $sock = IO::Socket::INET->new(
        PeerAddr => '235p008',
        PeerPort => 4016,
        Proto    => 'tcp',
    ) or die "couldn't create socket: $@";

    my $sel = IO::Select->new();
    $sel->add($sock);

    my $i = 0;
    while ( $sock->connected() ) {
        if ( $sel->can_write(0.05) ) {
            # pad every line to 100 chars plus newline; $i is incremented
            # separately so the printed value and its length always match
            last unless print $sock $i, 'x' x (100-length($i)), "\n";
            $i++;
        } else {
            print "cannot write\n";
            sleep(1); # todo: sleep less with Time::HiRes
        }
    }
Works like a charm ;-) I throttled the server with a sleep at the end of the
    foreach my $sock ( $sel->can_read(0.05) ) {
        ....
    }
loop.
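Since can_write() only promises room for at least one byte, a stricter
client would check how much each write actually accepted. Here is a minimal
sketch of that, assuming the socket may be non-blocking and using syswrite()
instead of print. The name write_all() is made up for illustration and is
not part of IO::Socket or IO::Select.

```perl
use strict;
use warnings;
use Errno;          # populates %! with the EAGAIN/EWOULDBLOCK/EINTR keys
use IO::Socket;
use IO::Select;

# Hypothetical helper: write all of $data to $sock, waiting with
# can_write() whenever the send buffer is full.
# Returns the number of bytes written, or nothing on timeout/hard error.
sub write_all {
    my ($sock, $data, $timeout) = @_;
    my $sel    = IO::Select->new($sock);
    my $offset = 0;
    my $length = length $data;

    while ( $offset < $length ) {
        # can_write() only tells us that at least one byte will fit
        return unless $sel->can_write($timeout);

        my $written = syswrite( $sock, $data, $length - $offset, $offset );
        if ( defined $written ) {
            $offset += $written;    # may be less than we asked for
        }
        elsif ( $!{EAGAIN} || $!{EWOULDBLOCK} || $!{EINTR} ) {
            next;                   # buffer filled up again, wait and retry
        }
        else {
            return;                 # real error, e.g. EPIPE
        }
    }
    return $offset;
}
```

In the loop above, the print line could then become something like
"last unless write_all($sock, $line, 0.05);", with $line built beforehand.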
>
>
>>>Well, mine is multiplexing--it's meant to take in more than one connection
>>>at a time. Yours was written with a single connection in mind, although it
>>>demonstrates perfectly that buffering shouldn't be an issue, even at 1k
>>>block sizes. The multiplexing is what's complicated mine so much. :)
>>
>>No. Have you tried it? It handles exactly SOMAXCONN connections.
>>The only reason for using the $max_buf variable is preventing one client
>>from flooding the server with data and neglecting the other clients.
>>If the client data dribbles in slowly, one could also use a $max_time
>>value (with Time::HiRes) for the maximum time the server processes one
>>socket at a time.
>
>
> Ah, that explains the limitation. Good idea, actually, and I think I'll
> keep that then. Actually, do you know if the buffer on the socket is one
> single pool, or if each connection to the port has its own buffer? That
> goes back to my other statement about not wanting one attacker to DoS the
> whole socket by flooding one fd.
Would be a bloody useless concept otherwise, wouldn't it? ;-) Each
connection has its own buffers on both ends.
Do you worry about your own clients behaving badly due to user input, or
someone else connecting to the server and flooding it with data?
The latter can be easily avoided by implementing identification of the
client or proper authorization of a user. Otherwise, drop the connection.
>
>
>>The foreach loop looks for all readable sockets and handles them. The only
>>thing left out is handling a specific timeout for connected client sockets
>>without incoming data (could be easily done via a lookup hash with the
>>stringified socket values as keys and the time value of the last action on
>>the socket as values, shouldn't be more than a few lines).
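To spell out the lookup hash mentioned in the quoted paragraph above, here
is a rough sketch. The names %last_seen, touch_socket(), forget_socket()
and idle_sockets() are invented for this example, and the current time is
passed in explicitly so the logic is easy to check.

```perl
use strict;
use warnings;

# Hypothetical bookkeeping: stringified socket => time of last activity.
my %last_seen;

# Record activity on a socket (call after every successful read).
sub touch_socket {
    my ($sock, $now) = @_;
    $last_seen{"$sock"} = defined $now ? $now : time;
    return;
}

# Forget a socket once it has been closed.
sub forget_socket {
    my ($sock) = @_;
    delete $last_seen{"$sock"};
    return;
}

# Return those handles in @socks that were idle for more than
# $idle_limit seconds, measured against $now.
sub idle_sockets {
    my ($idle_limit, $now, @socks) = @_;
    return grep {
        my $seen = $last_seen{"$_"};
        defined $seen && $now - $seen > $idle_limit;
    } @socks;
}
```

In the server loop one would presumably call touch_socket($sock) after each
read and, once per pass, close everything idle_sockets(60, time,
$sel->handles) returns (60 seconds is an arbitrary value), removing each
handle from the IO::Select object and calling forget_socket() on it.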
>
>
> Yes, I caught the fact that it would take multiple connections. I even
> tested it. However, the logging was not multiplexed, where all my protocol
> states must be. You weren't differentiating in the log between fd's, but
> it was really the proof that the buffering was fine that mattered to me.
> The rest is no problem. That was what had me really worried.
Yeah, I didn't care about the logging, left that for you ;-)
>
> I've got it halfway rolled into my code. I separated out the flow into two
> loops--one will go as fast and hard as it can to read data (I'm going to
> implement the max_buf now that I know why you had it), and the other is for
> processing the input, and at every possible pausing point it checks to see
> if there is more data to be read and will go back to reading as immediately
> as possible if there is. I "just" need to roll in the code that breaks up
> the packets from this large internal buffer where I'm storing huge hunks of
> code that aren't even analysed. Once I have that, it should hopefully be
> fine.
>
> Chances are, nothing in the intended use would stress it anywhere near what
> I have been putting it through. Then again, I don't like to take chances.
Be careful not to reinvent the wheel. This separation into two loops and
the "large internal buffer" do not seem necessary to me. Just limit the
socket processing to a maximum size per check (like I did with $max_buf)
and a maximum time (e.g. 1/10th of a second) and process it. You have to
do this anyway. Storing the data elsewhere will not take less time ;-)
Especially since the client can stop sending data if the server is
not able to process it.
An exception to that could be that you need a lot of data in one piece
to be able to process it.
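A rough sketch of what I mean by limiting both size and time per socket.
This is not the server code from my earlier post; read_bounded() and its
return convention are made up here, and only sysread(), IO::Select and
Time::HiRes are real.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);
use IO::Select;

# Hypothetical helper: read from one socket until nothing more is pending,
# $max_buf bytes have been read, or $max_time seconds have elapsed.
# Returns ($data, $eof); $eof is true once the peer closed the connection.
sub read_bounded {
    my ($sock, $max_buf, $max_time) = @_;
    my $sel      = IO::Select->new($sock);
    my $deadline = time + $max_time;
    my $data     = '';
    my $eof      = 0;

    while ( length($data) < $max_buf && time < $deadline ) {
        last unless $sel->can_read(0);      # nothing pending right now

        # append to $data, never asking for more than the remaining quota
        my $n = sysread( $sock, $data, $max_buf - length($data),
                         length($data) );
        last unless defined $n;             # read error, caller decides
        if ( $n == 0 ) { $eof = 1; last }   # peer closed the connection
    }
    return ( $data, $eof );
}
```

Called once per readable socket inside the can_read() loop, this moves on
to the next client after at most $max_buf bytes or $max_time seconds, so one
busy connection cannot starve the others.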
>
> If I think back to how lousy NFS performance is if you have it set to less
> than 8192 byte packets, I should have realised exactly what was going on,
> probably. I just didn't think those 15 single-byte reads per packet were
> that expensive. And then the small packets on top of it. Ugh. That
> pretty much explains it all.
>
> I thank you SO much for your assistance, Thomas! You have no idea how much
> relief I feel at this point. I did a brief test of your read methodology
> rolled partly into the small sample I had here and it was working 100%
> and consistently. I'm rolling into the real thing now, which is a wee bit
> more complicated.
Looking at your code, I have the impression that you made it more
complicated than it should be.
If you could specify what exactly your application should do, I'll perhaps
be able to give you a few design suggestions.
And if you want to concentrate on functionality, have a look at the POE
framework (http://poe.perl.org).
Thomas
--
open STDIN,"<&DATA";$=+=14;$%=50;while($_=(seek( #J~.> a>n~>>e~.......>r.
STDIN,$:*$=+$,+$%,0),getc)){/\./&&last;/\w| /&&( #.u.t.^..oP..r.>h>a~.e..
print,$_=$~);/~/&&++$:;/\^/&&--$:;/>/&&++$,;/</ #.>s^~h<t< ..~. ...c.^..
&&--$,;$:%=4;$,%=23;$~=$_;++$i==1?++$,:_;}__END__#....>>e>r^..>l^...>k^..