Re: Observer pattern limitations
- From: "David Barrett-Lennard" <davidbl@xxxxxxxxxxxx>
- Date: 12 Jul 2006 03:01:48 -0700
Dmitry A. Kazakov wrote:
On 11 Jul 2006 06:41:49 -0700, David Barrett-Lennard wrote:
Dmitry A. Kazakov wrote:
On 10 Jul 2006 07:24:12 -0700, David Barrett-Lennard wrote:
Dmitry A. Kazakov wrote:The general approach for replicas is time stamping. Unfortunately it does
not help you in reconstruction of a coherent system state. You'll also need
a history window. But even this does not give you an ability to reconstruct
the state for any given time point. Also it becomes a nightmare for
bidirectional connections.
Time stamping can be quite inefficient. For example a dependent in the
dependency graph may depend (indirectly) on thousands of inputs, and a
complete recurse of the graph is needed just to tell whether it is
dirty (ie to perform all the time stamp comparisons). Within a
process, propagation of dirtiness through a dependency graph seems a
better choice.
These are independent issues. You refer to a notification propagation. Time
stamping is for state reconstruction. Surely you could use it to
reconstruct state of "dirtiness", if you wanted.
Ok. I have seen designs use a "change count" which is essentially
a logical time stamp. This has formed the basis for guiding dependency
graph recalculation in a similar manner to the "make" tools for
compiling and linking software. When you spoke about using timestamps
I was thinking about that approach.
Note also that when you have a circular dependency all caches along this
path are dirty at *any* given moment of time.
Yes. I regard all caches along the circular path as being
"undefined". In most programs a cycle represents a design error.
I would only allow it (and therefore be forced to detect it to avoid
stack overflow) if the end-user is able to create cycles, such as in an
application like Excel.
As for graph closures, I don't think it is a good idea to evaluate them
each time, if you have a deep graph.
Huh? Are you talking about transitive closure of the graph?
Probably you should encapsulate
notification service and cache closures in there. In a language for
programming virtual channels (data-flow) I cached dependencies. It was easy
to do during compilation of channel descriptions.
But all this does not solve the problem 1, routing notifications is an
independent issue.
IMO, the lesson to learn: go higher-level, use ADTs with behaviors
*specific* to the problem. There is no general solution at low level.
You'll need to elaborate please!
There is always some a-priori knowledge about publishers and subscribers.
Each piece of information is useful. If I know that two variables are
published synchronously I place them into a container variable and publish
the container. For example, a controller might have 20-30 variables with
rates under 1ms. That would break the neck of any publisher-subscriber
system, if it won't take into account that they are synchronous.
I somewhat agree. The dependency graph should only be as fine grained
as it needs to be. There is a tradeoff. As we allow a single node to
represent more state we have the advantage of a simpler graph, and
reduce the overheads of adding or removing edges, and the number of
notification messages. However the downside is that values may be
calculated unnecessarily. Also, we may introduce a cycle.
I agree with you that the abstractions in the design will help provide
a natural and appropriate level of granularity for the dependency
graph.
You can't always avoid the need for caching, and the need to know when
caches are dirty. Furthermore, some problems have complex dependency
graphs that can't be abstracted away.
I pack them in a tightly coupled object like a record variable above. That
would cut all dependencies.
I have been building software for the last few years that allows the
end user to assemble complex persistent components like leggo, and use
formulae in various ways to wire them up in interesting ways. In that
domain I have to deal with complex dependency graphs that form at run
time and can't be abstracted away.
PROBLEM 3: Observer attaches or detaches during a notification
It is racing condition, when allowed to happen asynchronously.
No I'm talking about a single thread that is in the process of
iterating through the observers of a subject in order to notify them,
and an observer decides that it wants to synchronously detach (say).
The problem doesn't exist if the detach is performed asynchronously.
The rule of thumb for good OO design is that objects aren't subjects of
their own. So I would say that objects detaching themselves is a bad
design. However, technically it is no problem.
Please elaborate.
It is an empiric principle, however maybe it could be supported by
analogies in set theory. My observation is whenever you try to make objects
which, for example, delete themselves, you end up with a mess. Same with
detaching. You want to be "functional" using "forall x in X P(x)" pattern,
and at the same time you break it by changing the quantifier (the set X)
from P. It is fundamentally inconsistent, so the mess.
I now look at the problem of cache management across the program from a
global, graph theoretic perspective, because a local view as suggested
by the observer pattern doesn't seem to work well in practice.
This reminds me of the problem of detecting garbage which is (often)
nicely handled by a global tracing GC, whereas local approaches such as
reference counting are difficult and painful to get right, and often at
odds with writing reusable components. I agree that "real time"
issues remain a problem with graph theoretic approaches.
IMO the problem of when to attach/detach is conceptually dictated by
the result of recalculating a cache. As this happens the inputs are
accessed. These form the new set of edges to that node in the
dependency graph. The previous set of edges can safely be removed.
If notification only propagates dirtiness eagerly through the
dependency graph then we eliminate all the problems I have discussed.
This includes the question of when to update the dependency graph
topology - because that happens in a separate phase when caches are
lazily evaluated.
The problem is that while a subject is notifying one of the observers,
an observer decides to attach or detach to/from the subject. This may
cause a crash depending on the data structure used by the subject to
store the observers.
This is not a problem, it is a bug. List and its iterators are objects. So
it is legal to modify the list while iterating it.
Double linked lists (std::list) will work, std::vector or std::deque
will not.
No. It is no matter how the list is implemented. If the contract reads:
iterators stay valid upon list modification, then, well, it is the
contract.
...which is the case for std::list, but not for std::vector and
std::deque.
For what it is worth, I am not a fan of STL design... (It is bad OO,
largely because Stepanov does not believe in it ... but does. (:-))
The very idea to change the list of observers while notifying them is
horrible to me. For example, does it matter whether the new observers
being attached also receive the notification?
Yes it does. In my implementation of persistent objects the weak reference
notification callback passes a set of objects, into which you can place new
objects. They will be notified as well. The set is held by the caller.
Sounds messy to me :)
It is much cleaner than to expose the list and iterators. How often new
objects are created upon notification? In fact, it is just a way to keep it
"functional", by replacing
forall x in X modify X
with
forall x in X modify Y; forall y in Y modify Z ...
Yes it's cleaner that to expose the list and iterators.
However, I'm comparing it to the idea of only propagating dirtiness
during state change notifications. That eliminates the problem
completely.
PROBLEM 4: Memory management problems
There seems to only be one general solution (that I can think of). A
subject needs to hold strong references on its observers.
Well, usually it is a weak reference observer->subject. Clearly each weak
reference is a strong one in the backward direction. But implementations
don't care about it, because finalization of an observer finalizes the
reference with it. So there is nothing to care about.
Your typical C++ implementation of the observer pattern won't be able
to solve the memory management problem in the example I gave.
It does (and it is not C++, though it were also possible in C++ albeit
through much more sweat and tears). It uses handles.
I don't see how. Please elaborate
Handle = smart pointer.
No, that won't solve the problem when you have pre-written components
in a third party library and you need to add the objects that represent
the "wiring". What deletes the wiring?
Either subjects need to store strong references to the observers, or
else observers need a special notification for when the subject is
being destroyed. Neither of these approaches is done in typical
implementations of the observer pattern.
You have to do a closure of all publisher-subscriber relations. Its result
will be a new observation relation. In general, observations are not
composable. Consider:
c ::= a + b
Let a and b change synchronously, then asynchronous notification
propagation would produce rubbish in c due to race condition.
I don't know what you mean. If a,b both change and they both mark c
as dirty, then later when c is requested it is recalculated and will
simply use the new values of a,b, and therefore give the correct
result. Where is the race condition?
Race condition is in c. When notification about a arrives before one of b,
c thinks that b's cache is OK, which is wrong. Note that even when c does
not cache anything, there is always some hidden cache present. When c reads
a, somebody else might be busy with changing a or b. You must lock
something, somewhere to kill transition processes.
This is a multi-threading issue. I was only addressing the problem of
cache management for a single thread.
I think I have more of a functional programming perspective on this
than you. I try to promote the idea that the order of calculation is
driven naturally by functional dependencies. This is a powerful
approach to help write code that is both correct and very efficient.
Hmm, that's the observer pattern is all about: triggering things =
determining the order.
I'm talking about examples like "c := a.get() + b.get()". The
act of executing that code causes a and b to be calculated *before* c
is calculated. Lazy evaluation is great for solving most order of
calculation problems.
By contrast, most spreadsheets use an algorithm called "topological
sort". This involves sorting all the nodes to be calculated based on
the partial ordering defined by the dependency edges.
The problem is that there might be no order compatible with the semantics.
To determine this or to find a right order is a very difficult task. Note
that more or less elaborated observation primitives (whatever they be) are
not composable. I.e. you cannot keep the invariant "the order is correct"
per construction. One way is to limit them to a composable subset, this is
what SPARK and Ravenscar profiles do. Unfortunately, the subsets obtained
are quite weak. The framework must be enough powerful to implement, say, a
network layer, but weak to create two waitable mutexes. It is quite
challenging...
I suspect we are solving different problems!
I'm not sure to what extent we have a basis for comparison because you
seem to be interested in a fully asynchronous and perhaps distributed
system, whereas I'm talking specifically about a single process and
avoiding multi-threading issues.
In that setting there is nothing simpler that the idea to eagerly
propagate dirtiness through the dependency graph, and avoid all the
problems I have discussed in this thread. To me, the only quandary is
to somehow avoid unnecessary work when intermediate nodes end up having
the same value after recalculation. In the problems I have dealt with
this happens more often than not.
This way of thinking seems simple and elegant.
It could turn even worse. You can play a little with data flow languages
like MatLab/Simulink, DiaDem which basically postpone everything to the
next iteration step. It is the idea taken to an extreme. The major problem
with it is that is extremely slow [in order of magnitudes] as compared with
an asynchronous design. Another problem is that being so slow it severely
suffers from oversampling. You will deal with phantoms, state changes which
are not. The example with differential is an oversampling problem.
Cheers,
David Barrett-Lennard
.
- Follow-Ups:
- Re: Observer pattern limitations
- From: Leslie Sanford
- Re: Observer pattern limitations
- From: Dmitry A. Kazakov
- Re: Observer pattern limitations
- References:
- Observer pattern limitations
- From: David Barrett-Lennard
- Re: Observer pattern limitations
- From: Dmitry A. Kazakov
- Re: Observer pattern limitations
- From: David Barrett-Lennard
- Re: Observer pattern limitations
- From: Dmitry A. Kazakov
- Re: Observer pattern limitations
- From: David Barrett-Lennard
- Re: Observer pattern limitations
- From: Dmitry A. Kazakov
- Observer pattern limitations
- Prev by Date: Re: Observer pattern limitations
- Next by Date: Re: Observer pattern limitations
- Previous by thread: Re: Observer pattern limitations
- Next by thread: Re: Observer pattern limitations
- Index(es):