Re: Why does socket:read() take so long to determine broken connection?
From: Eric Sosman (eric.sosman_at_sun.com)
Date: 01/12/05
- Next message: Chris Berg: "Opening a document using exec()"
- Previous message: The Abrasive Sponge: "Re: Webserver & Tomcat"
- In reply to: Oliver Hittmeyer: "Why does socket:read() take so long to determine broken connection?"
- Next in thread: Steve Horsley: "Re: Why does socket:read() take so long to determine broken connection?"
- Reply: Steve Horsley: "Re: Why does socket:read() take so long to determine broken connection?"
- Reply: Tom Dyess: "Re: Why does socket:read() take so long to determine broken connection?"
- Reply: Oliver Hittmeyer: "Re: Why does socket:read() take so long to determine broken connection?"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Date: Wed, 12 Jan 2005 11:43:45 -0500
Oliver Hittmeyer wrote:
> hello [NG],
>
> Assuming a client/server app: Client and server are communicating by
> sending message objects to each other. When we're going to unplug the
> network adapter server-side, the server-app will detect this
> automatically & is fine out...
>
> not so the client-app: the application (implemented in blocking mode;
> implemented via 2 different threads) is still listening to the socket..
>
> and now there comes the story: when we're going to trigger the client-
> app to send a message object over to the server, the socket
>
> 1. does not throw any IOException when going to write the data
>
> 2. approx. 50sec after the write(), the blocking read() will throw
> a SocketException: Connection reset by peer
>
>
> does anybody know about this? why doesn't there occur an IOException
> when going to write to a socket, which does not have a connected peer?
>
> and why is it, that the read() does take 50sec to determine the socket
> is dead? - and are there any tips/tricks/work-arounds to get this
> "broken network"-detection faster?

The question really isn't about Java, but about TCP/IP
networking. Here's an over-simplified answer; for more
details consult a networking newsgroup or reference book.

TCP simulates a reliable data stream atop the lower-level
IP facility, which amounts to an unreliable transport for
individual "datagrams." It's a little bit like simulating a
telephone conversation by exchanging postcards, some of which
get lost. Naturally, there are conventions for establishing
and tearing down this virtual phone call: There's a prescribed
sequence of postcard exchanges to initiate the call ("I want to
talk to you." "Okay, I'm willing." "Great: let's begin."),
and another prescribed sequence by which both ends coordinate
a sign-off ("I'm finished." "Okay, so am I.") There's also
a system of acknowledgments to help recover when postcards
get lost or defaced in transit.
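
For contrast with what follows, here is a minimal sketch of
the well-behaved case, where the other end signs off properly.
It assumes a throwaway listener on the loopback interface; the
class and method names are made up for illustration. When the
peer closes in an orderly way, the caller's read() hears about
it and returns -1.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class OrderlyCloseDemo {
    /** Connects to a local listener, lets the peer close, and returns what read() reports. */
    public static int readAfterPeerCloses() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket listener = new ServerSocket(0, 1, loopback);
             // The Socket constructor returns once the opening handshake has completed.
             Socket caller = new Socket(loopback, listener.getLocalPort());
             Socket peer = listener.accept()) {
            peer.close();               // the peer says "I'm finished"
            caller.setSoTimeout(5000);  // safety net so the demo cannot hang forever
            return caller.getInputStream().read();  // -1 means the peer signed off
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read() returned " + readAfterPeerCloses());
    }
}
```

The point of the sketch is only that an orderly sign-off is
itself a message; it arrives because the peer was still able
to send it.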

But if the other participant in your simulated phone call
gets struck by lightning, how do you find out he's no longer
there? The only symptom is that some of your postcards go
unacknowledged -- but that can happen in the ordinary course
of events, since delivery is not reliable: Maybe your postcard
never reached him so he hasn't acknowledged it, or maybe his
acknowledging postcard never made it back to you. All you know
is that some days have gone by and you haven't received mail.
So you send out duplicates of the unacknowledged postcards,
and wait a while longer. Only after several such attempts do
you finally begin to suspect that your correspondent is no
longer "on the phone," even though you and he haven't agreed
to terminate the call. Note that there has been no "hang up"
signal -- he can only signal you by sending a postcard, and if
he can no longer send you postcards he can't tell you he wants
to hang up. Your postcards just vanish without eliciting any
response, and eventually you figure out he's stopped writing.
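
The 50-second delay is how long it takes before TCP finally
gives up and declares your correspondent unresponsive (there
are standards governing such timeouts, but a lot of systems are
configured with non-standard values; "50 seconds" is by no
means universal). That is also why your write() raised no
exception: it only drops the postcard in the mailbox, and it
returns long before anyone could know the card will go
unanswered.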
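
To see how "several attempts" can add up to most of a minute,
here is a back-of-the-envelope sketch. It assumes the sender
doubles its waiting time after every unanswered attempt, which
is the usual TCP back-off rule; the starting timer, the number
of attempts and the cap are hypothetical figures chosen for
illustration, not the settings of any particular system.

```java
public class RetransmitBudget {
    /**
     * Total seconds spent waiting before giving up, assuming the timer
     * doubles after each unanswered attempt, up to maxRtoSeconds.
     */
    public static double totalWaitSeconds(double initialRtoSeconds,
                                          int attempts,
                                          double maxRtoSeconds) {
        double rto = initialRtoSeconds;
        double total = 0.0;
        for (int i = 0; i < attempts; i++) {
            total += rto;                            // wait out this attempt's timer
            rto = Math.min(rto * 2, maxRtoSeconds);  // back off before the next try
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical figures: first timer 1.5 s, five attempts, 60 s cap.
        System.out.println(totalWaitSeconds(1.5, 5, 60.0) + " seconds");
    }
}
```

With those made-up figures the waits are 1.5, 3, 6, 12 and 24
seconds, or 46.5 seconds in all, the same order of magnitude as
the delay you observed.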

One further thing: When you've initiated a virtual phone
call but there's a long interval when neither side has anything
to say, how many postcards do you suppose are exchanged in the
period of silence? That's right: none. So if lightning strikes
your correspondent during such a period, you have no indication
at all of the event -- if you're sending no postcards, you won't
notice the lack of acknowledgments. Until you try to tell him
something you won't learn that he's no longer listening.
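
On the Java side, one way to stop a blocking read() from
waiting indefinitely on a silent line is Socket.setSoTimeout().
The sketch below is only an illustration: it uses a throwaway
loopback listener whose peer is perfectly alive but never says
anything, and the class and method names are invented. Note
that the timeout tells you only that nothing arrived in time,
not that the other end is dead.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class SilentPeerDemo {
    /** Returns true if read() gave up because nothing arrived within timeoutMillis. */
    public static boolean readTimesOut(int timeoutMillis) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket listener = new ServerSocket(0, 1, loopback);
             Socket caller = new Socket(loopback, listener.getLocalPort());
             Socket silentPeer = listener.accept()) {
            // silentPeer stays open but never writes: a period of silence.
            caller.setSoTimeout(timeoutMillis);  // bound how long read() may block
            try {
                caller.getInputStream().read();
                return false;                    // data or end of stream arrived
            } catch (SocketTimeoutException e) {
                return true;                     // silence: the peer may be idle or gone
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("timed out: " + readTimesOut(200));
    }
}
```

To tell "idle" from "gone" you still have to send something
and wait for an answer, for instance a small application-level
"are you there?" message at regular intervals. There is also
Socket.setKeepAlive(true), which asks the operating system to
probe an idle connection, but the default probe interval is
commonly measured in hours.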

I hope this analogy helps you interpret what you've observed.
All I ask is that you not try to design an entire system around
an imperfect analogy! If you need more details, there are lots
of references you can look up for the real nuts and bolts.

-- Eric.Sosman@sun.com