Re: [Python-Help] Reading from socket file handle took too long




Hi,

Thanks, Matthew, for the answer. It helped me think a little bit more.

I found a solution, and I think I found the culprit, but I would need to investigate more to be sure.

My solution is to work directly with the socket library instead of using the higher-level urllib2 library. Now it takes only around 0.005 s to read and process each chunk of data (a chunk is the data that arrived in the last 1/24 s), instead of between 0.04 s and 0.1 s. That's a lot faster!

I think the real culprit for slowing this process is httplib, which urllib2 uses, and the problem probably comes from the way httplib reads from the socket. It looks like httplib has some trouble reading when there's no EOF, or something like that.
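
For anyone who wants to try the same thing, here is a minimal sketch of the difference, not my actual code. read(n) on a socket file object generally keeps waiting until n bytes have arrived or the connection closes, while recv(n) on the socket itself returns whatever is already there. Whether that is really what slows httplib down here is still an assumption. The name read_chunk is made up for the example, and the demonstration uses a local socket pair rather than a real HTTP stream.

```python
import socket

def read_chunk(sock, limit=1000):
    """Return up to `limit` bytes as soon as any data is available.

    recv() hands back whatever has already arrived (at least one byte,
    or b"" once the peer has closed the connection). It does not wait
    for `limit` bytes to accumulate.
    """
    return sock.recv(limit)

if __name__ == "__main__":
    # Local demonstration with a connected socket pair, so no server is needed.
    sender, receiver = socket.socketpair()
    sender.sendall(b"x" * 10)          # far fewer bytes than the limit
    data = read_chunk(receiver, 1000)  # does not wait for 1000 bytes
    print(len(data))
    sender.close()
    receiver.close()
```

With a real HTTP connection the response headers would have to be read and skipped first; that part is left out here.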

Etienne

On 06-02-28, at 12:58, Matthew Dixon Cowles wrote:

Dear Etienne,

Hi,

Hi!

I monitored every call and found that the culprit is the read from
the socket file handle. It's the only bottleneck in my code. I tried
different approaches to the problem, but I always hit the same
problem.

I don't think it's normal that read() takes more than 0.04 s to
read 1000 bytes from memory (the socket file handle). How can I
avoid this problem?

temp = self.fh.read(self.limit) # <-- TAKES A LOT OF TIME

I agree with you that reading 1000 bytes from a socket shouldn't take
that long. Especially since it should really amount to copying the
data from one buffer to another.

It's just a guess, but I would suspect that the reason that it takes
that long to read a particular amount of data from the socket is that
there isn't that much data available when you start the read.

It ought to be easy enough to check that by reducing the amount you
try to read by a lot and seeing if the call finishes a lot faster.
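
A rough sketch of that check, assuming only that the handle has a read(n) method like self.fh in the line quoted above. The helper name timed_read is hypothetical, and the demonstration uses an in-memory buffer as a stand-in for the socket file handle.

```python
import io
import time

def timed_read(fh, nbytes):
    """Read nbytes from fh and return (elapsed_seconds, data)."""
    start = time.perf_counter()
    data = fh.read(nbytes)
    return time.perf_counter() - start, data

if __name__ == "__main__":
    # Stand-in for the socket file handle; replace with the real one.
    fh = io.BytesIO(b"x" * 5000)
    for size in (1000, 100, 10):
        elapsed, data = timed_read(fh, size)
        print("%d bytes requested, %d read, %.6f s" % (size, len(data), elapsed))
```

If the small reads come back much faster than the 1000-byte read on the real handle, the time is probably being spent waiting for data to arrive rather than copying it.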

If that turns out to be what's slowing you down, it might be useful
to separate the job of reading from the job of decoding and doing
whatever else you need to do with the data. To me, the obvious way to
try to do that would be with threads. I've never used Twisted, but I
imagine that it provides some mechanism for something like that.
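
To make the thread idea concrete, here is a minimal sketch using the standard threading and queue modules (not Twisted). One thread only reads and puts chunks on a queue, and the main thread takes them off and processes them. The names reader, process_stream, and handle are invented for the example, and the in-memory buffer stands in for the socket file handle.

```python
import io
import queue
import threading

def reader(fh, limit, chunks):
    """Read from fh in a loop and hand each chunk to the queue."""
    while True:
        data = fh.read(limit)
        if not data:          # empty result means end of stream
            break
        chunks.put(data)
    chunks.put(None)          # sentinel: tell the consumer we are done

def process_stream(fh, limit, handle):
    """Run the reader in a background thread and call handle() per chunk."""
    chunks = queue.Queue()
    worker = threading.Thread(target=reader, args=(fh, limit, chunks))
    worker.daemon = True
    worker.start()
    while True:
        data = chunks.get()   # blocks until the reader has produced something
        if data is None:
            break
        handle(data)          # decoding happens here, not in the reader
    worker.join()

if __name__ == "__main__":
    received = []
    process_stream(io.BytesIO(b"abcdefghij"), 4, received.append)
    print(received)
```

This way a slow read only delays the reading thread, and decoding can proceed on chunks that have already arrived.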

I hope that other folks here will also answer if they have a guess
about what's happening.

Regards,
Matt

