Re: Time outs and state machines
- From: "H. S. Lahman" <h.lahman@xxxxxxxxxxx>
- Date: Tue, 28 Feb 2006 18:56:55 GMT
Responding to Sanford...
Others have discussed the basic issues. I'm just providing a bit more detail on how one might actually implement things.
There is a scenario that's probably rare but it's bothering me. Suppose the receiving device sends an ACK message just before the timer elapsed event. The ACK message gets put on the queue and immediately after the timer event places its message on the queue as well. The state machine consumes the ACK message. The next message in the queue is the timer event. When the state machine consumes it, it believes that the 20 milliseconds for the last message have elapsed when in fact they have not.
Welcome to the Wonderful World of Race Conditions in R-T/E. B-) Your setup with a timer event is pretty standard, so this is a common problem.
I'm not sure what I'm missing, but I'd like my code to be more robust than this. Is there an additional state that would prevent the above scenario? I was thinking about giving each timer event a unique ID as one approach. When the state machine consumes a timer event, the ID must be valid for the next message or it is ignored.
It really depends on the context. The Big Question is: Do you have to do anything special if the device does its thing late? That is, do you have to undo something or avoid sending a repeat (e.g., via a different channel) if the device ACKs after the timeout? (I'll get to that at the end after I walk the simple situation.)
Hopefully the answer is No, which is why one likes re-entrant devices. In that case, at a simplistic level you just need to set up your STT to ignore the ACK or the Timer event when it does come in. For example, in a simplistic case we might have:
                        E1:ACK
[Waiting Patiently] ----------------> [Processed ACK]
        |
        | E2:Timeout
        |
        V
[Processed Timeout]
The STT might look like:
                   | E1:ACK         | E2:Timeout
-------------------+----------------+-------------------
Waiting Patiently  | Processed ACK  | Processed Timeout
-------------------+----------------+-------------------
Processed ACK      | Can't happen   | Ignore
-------------------+----------------+-------------------
Processed Timeout  | Ignore         | Can't happen
-------------------+----------------+-------------------
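A minimal Python sketch of how such a table might drive the dispatch loop. The class name, the string constants, and the use of a plain deque as the event queue are all assumptions made for illustration, not anything from the original design.

```python
from collections import deque

# States and events named after the diagram (the string values are arbitrary).
WAITING = "Waiting Patiently"
ACKED = "Processed ACK"
TIMED_OUT = "Processed Timeout"
E1_ACK = "E1:ACK"
E2_TIMEOUT = "E2:Timeout"

# Special STT entries that are not target states.
IGNORE = "ignore"
CANT_HAPPEN = "can't happen"

# The state transition table: (current state, event) -> table entry.
STT = {
    (WAITING, E1_ACK): ACKED,
    (WAITING, E2_TIMEOUT): TIMED_OUT,
    (ACKED, E1_ACK): CANT_HAPPEN,
    (ACKED, E2_TIMEOUT): IGNORE,
    (TIMED_OUT, E1_ACK): IGNORE,
    (TIMED_OUT, E2_TIMEOUT): CANT_HAPPEN,
}


class RequestMachine:
    """Hypothetical state machine for a single request."""

    def __init__(self):
        self.state = WAITING
        self.queue = deque()

    def post(self, event):
        self.queue.append(event)

    def run(self):
        # Consume queued events one at a time, consulting the STT for each.
        while self.queue:
            event = self.queue.popleft()
            entry = STT[(self.state, event)]
            if entry == IGNORE:
                continue  # late event: drop it without changing state
            if entry == CANT_HAPPEN:
                raise RuntimeError(f"{event} in state {self.state}")
            self.state = entry


# The race from the question: the ACK is queued just ahead of the timeout.
m = RequestMachine()
m.post(E1_ACK)
m.post(E2_TIMEOUT)
m.run()
print(m.state)  # Processed ACK
```

The late timeout is dropped by the table itself, so no action code has to know about the race.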
Alas, life is rarely simple. The problem lies in states that are entered after [Processed ACK] and [Processed Timeout]. If the E1 or E2 event is delayed a lot, this state machine could have migrated through several states. You can still ignore the events in all those states, but that gets tedious. It also gets dangerous because one could get back to [Waiting Patiently] for another request before the delayed event is consumed, as Daniel T. mentioned. Then one is processing the wrong ACK or timeout.
That's not a problem for the Timer because [Processed ACK] would turn it off (assuming no significant delays). One would then only have a problem if there was concurrent processing putting events on the queue while [Processed ACK]'s action was executing AND those events pushed this state machine through its states back to [Waiting Patiently]. However, that would be a Big Problem anyway because this machine wouldn't be in the right state to process those events for later in the life cycle (i.e., it could still be sitting in [Waiting Patiently] rather than [Processed ACK] or [Processed Timeout]). Thus this state machine's interactions should be constructed so that nobody can put an event on the queue until this machine is ready to accept it. IOW, all state machine interactions are done via handshaking and daisy-chaining event generation.
It is potentially a problem for E1, though, in a distributed system because the device isn't being turned off immediately. That means you could be back in [Waiting Patiently] after the timeout was processed and a second device request was made when the ACK for the first request finally gets consumed.
[You could have [Processed Timeout] send a shutoff message to the device, but that might be delayed a long time as well. You could also have [Processed Timeout] expunge the event queue, but that won't help if the delayed ACK is enqueued late in this object's life cycle. (Mucking with the event queue is also a very risky practice.)]
So what you need to do is have some way to tell one ACK from another. As others suggested, you probably need some additional identity for the event, such as a timestamp. Then [Processed ACK] can test the identity and ignore it _in its action_ (rather than in the STT). Then E1 is also a reflexive event for [Processed ACK] as it waits for the right one and the STT changes to:
                        E1:ACK
[Waiting Patiently] ----------------> [Processed ACK] <----+
        |                                    |             |
        | E2:Timeout                         |             |
        |                                    +-------------+
        V                                        E1:ACK
[Processed Timeout]
and
                   | E1:ACK         | E2:Timeout
-------------------+----------------+-------------------
Waiting Patiently  | Processed ACK  | Processed Timeout
-------------------+----------------+-------------------
Processed ACK      | Processed ACK  | Ignore
-------------------+----------------+-------------------
Processed Timeout  | Ignore         | Can't happen
-------------------+----------------+-------------------
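As a rough Python sketch of the identity check living in the action rather than in the STT: here each event carries a request identifier (a simple counter standing in for the timestamp suggested above). The Event tuple, the counter, the ack_accepted flag, and the send_request helper are assumptions introduced only for this example.

```python
from collections import namedtuple

WAITING = "Waiting Patiently"
ACKED = "Processed ACK"
TIMED_OUT = "Processed Timeout"
E1_ACK = "E1:ACK"
E2_TIMEOUT = "E2:Timeout"
IGNORE = "ignore"
CANT_HAPPEN = "can't happen"

# Revised STT: E1 is now reflexive on [Processed ACK].
STT = {
    (WAITING, E1_ACK): ACKED,
    (WAITING, E2_TIMEOUT): TIMED_OUT,
    (ACKED, E1_ACK): ACKED,
    (ACKED, E2_TIMEOUT): IGNORE,
    (TIMED_OUT, E1_ACK): IGNORE,
    (TIMED_OUT, E2_TIMEOUT): CANT_HAPPEN,
}

# Hypothetical event shape: the kind plus the identity of the request it answers.
Event = namedtuple("Event", ["kind", "request_id"])


class RequestMachine:
    def __init__(self):
        self.state = WAITING
        self.current_request = 0   # identity of the request we care about now
        self.ack_accepted = False  # set once the right ACK has been seen

    def send_request(self):
        # Stand-in for whatever transition leads back to [Waiting Patiently].
        self.current_request += 1
        self.ack_accepted = False
        self.state = WAITING
        return self.current_request

    def dispatch(self, event):
        entry = STT[(self.state, event.kind)]
        if entry == IGNORE:
            return
        if entry == CANT_HAPPEN:
            raise RuntimeError(f"{event.kind} in state {self.state}")
        self.state = entry
        if entry == ACKED:
            self._processed_ack_action(event)

    def _processed_ack_action(self, event):
        # The identity test is in the action: a stale ACK is ignored here
        # and the machine stays put until the right one arrives.
        if event.request_id != self.current_request:
            return
        self.ack_accepted = True


m = RequestMachine()
first = m.send_request()
m.dispatch(Event(E2_TIMEOUT, first))   # first request times out
second = m.send_request()              # a second request goes out
m.dispatch(Event(E1_ACK, first))       # delayed ACK for the first request
print(m.state, m.ack_accepted)         # Processed ACK False
m.dispatch(Event(E1_ACK, second))      # the ACK we actually want
print(m.state, m.ack_accepted)         # Processed ACK True
```

The timeout's identifier is not checked in this sketch, on the assumption stated earlier that [Processed ACK] turns the timer off promptly.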
Note that this all assumes that you don't have to worry about the device side of things if an ACK comes in after a Timer timeout is processed. If you do need to worry about the device side of things, then your state machine will get more complicated -- so much so that I would consider delegating that processing to another object. That is, if [Processed ACK] encounters a lagged E1, it would send an event to the delegatee object to deal with that situation while it continues to wait for the ACK that this object /now/ cares about.
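One way the delegation might look, as a hedged Python sketch. LateAckHandler, AckWaiter, and their methods are hypothetical names, and a direct method call stands in for sending an event to the delegatee.

```python
class LateAckHandler:
    """Hypothetical delegatee that owns the device-side cleanup for late ACKs."""

    def __init__(self):
        self.late_acks = []

    def on_late_ack(self, request_id):
        # A real handler might undo work or cancel a repeat sent elsewhere;
        # this sketch only records which request was ACKed late.
        self.late_acks.append(request_id)


class AckWaiter:
    """Only the [Processed ACK] action is shown; the STT dispatch is omitted."""

    def __init__(self, delegate):
        self.delegate = delegate
        self.current_request = None
        self.ack_accepted = False

    def send_request(self, request_id):
        self.current_request = request_id
        self.ack_accepted = False

    def processed_ack_action(self, request_id):
        if request_id != self.current_request:
            # Lagged E1: hand it off and keep waiting for the current ACK.
            self.delegate.on_late_ack(request_id)
            return
        self.ack_accepted = True


handler = LateAckHandler()
waiter = AckWaiter(handler)
waiter.send_request(1)            # first request; assume it then timed out
waiter.send_request(2)            # second request is now the one that matters
waiter.processed_ack_action(1)    # late ACK for request 1 goes to the delegatee
waiter.processed_ack_action(2)    # ACK for request 2 is accepted here
print(handler.late_acks, waiter.ack_accepted)  # [1] True
```

Keeping the cleanup in a separate object leaves the waiting machine's STT unchanged from the one above.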
*************
There is nothing wrong with me that could
not be cured by a capful of Drano.
H. S. Lahman
hsl@xxxxxxxxxxxxxxxxx
Pathfinder Solutions -- Put MDA to Work
http://www.pathfindermda.com
blog: http://pathfinderpeople.blogs.com/hslahman
(888)OOA-PATH