Re: Problem with writing fast UDP server



On Nov 20, 9:03 am, Krzysztof Retel <Krzysztof.Re...@xxxxxxxxxxxxxx>
wrote:
Hi guys,

I am struggling to write a fast UDP server. It has to handle around 10000
UDP packets per second. I started building it with non-blocking
sockets and threads. Unfortunately my approach does not work at all.
I wrote a simple test case: a client and a server. The client sends 2200
packets within 0.137447118759 secs. tcpdump captured 2189 packets,
which is not bad at all.
But the server only handles 700 to 870 packets when it is
non-blocking, and only 670 to 700 with blocking sockets.
The client and the server are on the same local network,
and tcpdump shows roughly the correct number of packets received.

I included a bit of the code of the UDP server.

import socket
import threading

class PacketReceive(threading.Thread):
    def __init__(self, tname, sock, queue):
        threading.Thread.__init__(self, name=tname)
        self._tname = tname
        self._socket = sock
        self._queue = queue

    def run(self):
        print 'Started thread: ', self.getName()
        cnt_msgs = 0
        while True:
            try:
                data = self._socket.recv(512)
            except socket.error:
                # Nothing to read yet on a non-blocking socket; try again.
                continue
            cnt_msgs += 1
            # self._queue.put(data)
            print 'thread: %s, cnt_msgs: %d' % (self.getName(), cnt_msgs)

I was also using Queue, but this didn't help either.
Any idea what I am doing wrong?

I read that the Python socket module was causing some delays with
TCP servers. The recommendation was to set the no-delay socket option:
"sock.setsockopt(SOL_TCP, TCP_NODELAY, 1)". I couldn't find any
similar option for UDP sockets.
Is there anything I have to change in the socket options to make it
work faster?
Why can't the server process all incoming packets? Is there a bug in
the socket layer? Btw, I am using Python 2.5 on Ubuntu 8.10.

Cheers
K

First and foremost, you are not being realistic here. Attempting to
squeeze 10,000 packets per second out of 10Mb/s (assumed) Ethernet is
not realistic. The maximum theoretical limit is 14,880 frames per
second, and that assumes each frame occupies only 84 bytes on the wire
(a minimum-size 64-byte frame plus preamble and inter-frame gap), making
it useless for data transport. Using your numbers, each 90-byte datagram
needs about 156 bytes on the wire (90B of data + 66B of UDP, IP and
Ethernet overhead), which works out to a theoretical maximum of ~8000
frames per second. These are obviously some rough numbers but I
believe you get the point. It's late here, so I'll double check my
numbers tomorrow.
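
A rough way to check figures like these is to tally the per-frame overhead field by field. The sketch below assumes UDP over IPv4 with no IP options and no VLAN tag; the function names are made up for illustration, and the result is an upper bound on an otherwise idle link, not a prediction.

```python
# Rough on-wire budget for UDP over IPv4 on Ethernet (no VLAN tag assumed).
PREAMBLE_SFD = 8      # preamble + start-of-frame delimiter
ETH_HEADER = 14       # destination MAC, source MAC, EtherType
ETH_FCS = 4           # frame check sequence
INTERFRAME_GAP = 12   # mandatory idle time, expressed in bytes
IP_HEADER = 20        # IPv4 header without options
UDP_HEADER = 8
MIN_ETH_PAYLOAD = 46  # shorter payloads are padded up to this size


def wire_bytes(udp_payload):
    """Bytes one datagram occupies on the wire, including padding and gap."""
    eth_payload = max(IP_HEADER + UDP_HEADER + udp_payload, MIN_ETH_PAYLOAD)
    return PREAMBLE_SFD + ETH_HEADER + eth_payload + ETH_FCS + INTERFRAME_GAP


def max_frames_per_second(link_bits_per_second, udp_payload):
    """Upper bound on frames per second for a given UDP payload size."""
    return link_bits_per_second // (wire_bytes(udp_payload) * 8)


if __name__ == "__main__":
    # 18 bytes is the largest UDP payload that still fits a minimum-size frame.
    for payload in (18, 90, 1472):
        for link in (10_000_000, 100_000_000):
            print(payload, wire_bytes(payload), link,
                  max_frames_per_second(link, payload))
```

Under these assumptions a 90-byte payload stays below 10,000 frames per second on a 10Mb/s link, while a 100Mb/s link leaves considerable headroom.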

In your case, you would not want to use TCP_NODELAY, even if you were
to use TCP, as it would actually limit your throughput. TCP_NODELAY
disables Nagle's algorithm, which coalesces small writes into fewer,
larger segments. UDP does not have such an option because each datagram
is sent immediately as its own packet (one Ethernet frame, as long as it
fits within the MTU) - which is not true for TCP, as TCP is a stream. In
this case, use of TCP may significantly reduce the number of frames
required for transport - assuming TCP_NODELAY is NOT used. If you want
to increase your throughput, use larger datagrams. If you are on a
reliable connection, which we can safely assume since you are currently
using UDP, use of TCP without TCP_NODELAY may yield better performance
because of its buffering strategy.

Assuming you are using 10Mb ethernet, you are nearing its
frame-saturation limits. If you are using 100Mb ethernet, you'll obviously
have a lot more elbow room, but not nearly as much as one would hope,
because 100Mb is only possible when frames are completely
filled. It's been a while since I last looked at 100Mb numbers, but
it's not likely most people will see numbers near its theoretical
limits simply because that number has so many caveats associated with
it - and small frames are its nemesis. Since you are using very small
datagrams, you are wasting a lot of potential throughput. And if you
have other computers on your network, the situation is made yet more
difficult. Additionally, many switches and/or routers also have
bandwidth limits which may or may not pose a wall for your
application. And to make matters worse, you are allocating lots of
buffers (4K) to send/receive 90 bytes of data, creating yet more work
for your computer.
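
One way to cut the per-datagram allocation work is to receive into a single buffer that is allocated once and reused. This is only a sketch in Python 3, not the original poster's server: count_datagrams and its parameters are hypothetical names, and the loop just counts packets instead of printing on each one.

```python
import socket


def count_datagrams(sock, expected, bufsize=512, timeout=1.0):
    """Receive up to `expected` datagrams into one reusable buffer.

    Returns (datagrams_received, bytes_received). Stops early if no
    datagram arrives within `timeout` seconds.
    """
    buf = bytearray(bufsize)      # allocated once, reused for every datagram
    view = memoryview(buf)
    sock.settimeout(timeout)
    packets = 0
    total_bytes = 0
    while packets < expected:
        try:
            nbytes = sock.recv_into(view)
        except socket.timeout:
            break                 # nothing more arrived in time
        packets += 1
        total_bytes += nbytes
        # Real handling of view[:nbytes] would go here; printing on every
        # packet is deliberately avoided because it slows the loop down.
    return packets, total_bytes


if __name__ == "__main__":
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))           # any free port on loopback
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    for _ in range(20):
        sender.sendto(b"x" * 90, receiver.getsockname())
    print(count_datagrams(receiver, 20))
    sender.close()
    receiver.close()
```

The demo runs over loopback, so it only exercises the receive loop itself and says nothing about what a real network link can carry.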

Options to try:
- See how TCP measures up for you.
- Attempt to place multiple data objects within a single datagram,
  thereby optimizing available ethernet bandwidth.
- You didn't say if you are CPU-bound, but you are creating a tuple and
  appending it to a list on every datagram. You may find that allocating
  smaller buffers and optimizing your history accounting helps if
  you're CPU-bound.
- Don't forget, localhost does not suffer from frame limits - it's
  basically testing your memory/bus speed.
- If this is for local use only, consider using a different IPC
  mechanism - unix domain sockets or memory mapped files.
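
The suggestion to place multiple data objects in a single datagram could look roughly like this in Python 3. The record layout (an id, a kind and a value) is invented for illustration, and the 1472-byte limit assumes a 1500-byte MTU with plain IPv4 and UDP headers; adjust both to the real data and network.

```python
import struct

RECORD = struct.Struct("!IHd")    # hypothetical record: id, kind, value
HEADER = struct.Struct("!H")      # number of records in the datagram
MAX_UDP_PAYLOAD = 1472            # 1500-byte MTU minus IPv4 and UDP headers


def pack_batches(records, max_payload=MAX_UDP_PAYLOAD):
    """Yield datagram payloads, each carrying as many records as fit."""
    per_datagram = (max_payload - HEADER.size) // RECORD.size
    for start in range(0, len(records), per_datagram):
        chunk = records[start:start + per_datagram]
        body = b"".join(RECORD.pack(*rec) for rec in chunk)
        yield HEADER.pack(len(chunk)) + body


def unpack_batch(payload):
    """Return the list of records carried by one datagram payload."""
    (count,) = HEADER.unpack_from(payload, 0)
    return [RECORD.unpack_from(payload, HEADER.size + i * RECORD.size)
            for i in range(count)]


if __name__ == "__main__":
    records = [(i, i % 3, i * 0.5) for i in range(1000)]
    batches = list(pack_batches(records))
    print(len(records), "records in", len(batches), "datagrams")
```

Each payload would then be handed to sendto() as one datagram, so the fixed per-frame overhead is paid once per batch rather than once per record. The trade-off is that a lost datagram now loses a whole batch.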


