Observer pattern limitations



The observer pattern is fundamental to OO, because it is so common for
objects to need to cache the results of calculations. Display pixels
on the screen essentially represent a cache, and display redrawing is a
common use case for the observer pattern.

In general a cache can depend on a number of inputs, possibly in
different objects, and the cache needs to be notified when the inputs
change. There are two basic approaches: either recalculate the cache
immediately, or merely mark it as dirty and recalculate on demand.
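
As a rough illustration of the two approaches, here is a minimal
Python sketch. The names Subject, EagerSquare and LazySquare are made
up for this example, and the "calculation" is just squaring a number.

```python
class Subject:
    """Holds a value and notifies attached observers when it changes."""

    def __init__(self, value):
        self._value = value
        self._observers = []

    def attach(self, observer):
        self._observers.append(observer)

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, new_value):
        self._value = new_value
        for observer in self._observers:
            observer.on_changed()


class EagerSquare:
    """Approach 1: recalculate the cache immediately on every notification."""

    def __init__(self, source):
        self._source = source
        self.recalculations = 0
        source.attach(self)
        self.on_changed()

    def on_changed(self):
        self.recalculations += 1
        self.cached = self._source.value ** 2


class LazySquare:
    """Approach 2: only mark the cache dirty; recalculate on the next read."""

    def __init__(self, source):
        self._source = source
        self._dirty = True
        self._cached = None
        self.recalculations = 0
        source.attach(self)

    def on_changed(self):
        self._dirty = True

    @property
    def cached(self):
        if self._dirty:
            self.recalculations += 1
            self._cached = self._source.value ** 2
            self._dirty = False
        return self._cached
```

The eager version does work on every change even if nobody reads the
result; the lazy version defers the work until the cache is read.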

Over the last few years I have come across a number of pitfalls with
the observer pattern. These problems don't appear to be well known.
The general attitude seems to be that you can simply wire up objects
in peer-to-peer fashion and they will work as intended, or else that
you test, and when you find a bug you think of some workaround. This
reflects a sorry lack of formalism in the OO community.

These problems relate to program correctness, rather than concerns
about the open/closed principle, reusability, modularity etc.
Unfortunately correctness has tended to take a back seat in the
numerous discussions of the observer pattern in newsgroups or the
literature.

These problems are insidious, because code that looks fine actually
hides errors, and more to the point, the errors only reveal themselves
as object assemblies and interactions become more complex.

Things get even more complex when you throw persistence or
multithreading into the mix. An important question that I won't
comment on here is whether subject-observer relationships should
persist in an OODBMS. Note that if they do, subjects will fault all
dependents into memory - even dependents that aren't themselves being
observed.

PROBLEM 1: Infinite recursion in notifications.

This occurs when there is a cycle in the dependency graph. Now
technically cycles should be impossible, and that is indeed the case if
data dependency is represented at a fine-grained enough level. However
in practice it is common for objects to host multiple caches, so that
there may be cycles at the coarse resolution of objects.
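
A minimal Python sketch of how this can happen, under assumptions of
my own: object A hosts two caches (call them a1 and a2), object B hosts
one (b1), b1 depends on a1 and a2 depends on b1. Per cache there is no
cycle, but per object there is. The Node class and its conservative
"forward everything" policy are hypothetical.

```python
class Node:
    """Coarse-grained object that re-notifies its observers on any change."""

    def __init__(self, name):
        self.name = name
        self._observers = []

    def attach(self, observer):
        self._observers.append(observer)

    def notify(self):
        for observer in self._observers:
            observer.on_changed(self)

    def on_changed(self, subject):
        # The object cannot tell which of its caches was affected, so it
        # conservatively forwards the notification to its own observers.
        self.notify()


def cycles_forever():
    a = Node("A")  # hosts caches a1 and a2
    b = Node("B")  # hosts cache b1
    a.attach(b)    # b1 depends on a1
    b.attach(a)    # a2 depends on b1: acyclic per cache, cyclic per object
    try:
        a.notify()
    except RecursionError:
        return True
    return False
```

In Python the runaway recursion surfaces as a RecursionError; in other
languages it would more likely be a stack overflow.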

PROBLEM 2: Dirty reads.

Let's write X --> Y to indicate that object Y has attached as an
observer to subject X. The arrow is meant to indicate the direction of
data flow (ie dependency).

Suppose we have a redundant edge in the dependency graph, such as

X --> Y
Y --> Z
X --> Z

Suppose X changes and notifies Z before Y. Then Z may read a stale
cached value from Y, because X hasn't yet marked Y as dirty.
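
The following Python sketch reproduces this under some assumptions of
my own: Y caches 2 * x lazily (dirty flag), Z caches x + y eagerly, and
observers are notified in attachment order. All class and method names
are invented for the example.

```python
class Subject:
    def __init__(self):
        self._observers = []

    def attach(self, observer):
        self._observers.append(observer)

    def notify(self):
        for observer in self._observers:
            observer.on_changed()


class X(Subject):
    def __init__(self, value):
        super().__init__()
        self.value = value

    def set(self, value):
        self.value = value
        self.notify()


class Y(Subject):
    """Caches 2 * x.value; marked dirty on notification, recalculated on read."""

    def __init__(self, x):
        super().__init__()
        self._x = x
        self._dirty = True
        self._cached = None

    def on_changed(self):
        self._dirty = True
        self.notify()

    def get(self):
        if self._dirty:
            self._cached = 2 * self._x.value
            self._dirty = False
        return self._cached


class Z:
    """Caches x.value + y.get(); recalculated immediately on notification."""

    def __init__(self, x, y):
        self._x = x
        self._y = y
        self.history = []
        self.on_changed()

    def on_changed(self):
        self.history.append(self._x.value + self._y.get())


def run():
    x = X(1)
    y = Y(x)
    z = Z(x, y)
    x.attach(z)  # X --> Z, attached first, so Z is notified before Y
    x.attach(y)  # X --> Y
    y.attach(z)  # Y --> Z
    x.set(10)
    return z.history
```

Here run() returns [3, 12, 30]. The middle value 12 mixes the new x
with the old y. In this sketch Z is later corrected via Y --> Z, but it
has already been computed once from inconsistent inputs, and anything
that acted on that value has seen the wrong answer.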

PROBLEM 3: Observer attaches or detaches during a notification

The problem is that while a subject is notifying its observers, one of
them decides to attach to or detach from the subject. This may cause a
crash, depending on the data structure used by the subject to store
the observers.

One hack I've seen to solve this problem is to use a "one shot
observer". Another is to make a copy of the list of observers before
issuing the notifications.
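
Here is a small Python sketch of the copy-the-list workaround. The
classes are hypothetical. Python lists happen not to crash when
modified during iteration, so the failure shows up as a silently
skipped observer instead.

```python
class Subject:
    def __init__(self, snapshot):
        self._observers = []
        self._snapshot = snapshot

    def attach(self, observer):
        self._observers.append(observer)

    def detach(self, observer):
        self._observers.remove(observer)

    def notify(self):
        # Copying the list isolates the iteration from attach/detach calls
        # made by observers while the notification is in progress.
        targets = list(self._observers) if self._snapshot else self._observers
        for observer in targets:
            observer.on_changed(self)


class SelfDetaching:
    """Detaches itself from the subject the first time it is notified."""

    def __init__(self, log, name):
        self._log = log
        self._name = name

    def on_changed(self, subject):
        self._log.append(self._name)
        subject.detach(self)


class Plain:
    def __init__(self, log, name):
        self._log = log
        self._name = name

    def on_changed(self, subject):
        self._log.append(self._name)


def notified(snapshot):
    log = []
    subject = Subject(snapshot)
    subject.attach(SelfDetaching(log, "detacher"))
    subject.attach(Plain(log, "plain"))
    subject.notify()
    return log
```

Without the copy, notified(False) gives ["detacher"]: the second
observer is never told. With the copy, notified(True) gives
["detacher", "plain"]. Note that the copy has a cost of its own: an
observer that gets detached midway can still receive the notification
already in flight.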

PROBLEM 4: Memory management problems

This affects languages like C++ with no automatic garbage collection.
It also affects reference-counting systems like COM.

Consider a programmer's toolkit that provides various concrete classes
like Room, CDPlayer and Button that support assembly. The CDPlayer
and Button can be added to the Room, making them visible in the GUI.
There is a need to make the button control the CDPlayer. This simply
involves writing a "wiring" class that implements IButtonListener,
responds to the OnPressed() message, and calls Play() on its associated
CDPlayer.

The problem is: What deletes the object used for wiring purposes?

It is assumed that the toolkit hides the Room class, in the sense that
clients can only create instances and call methods through a pure
abstract interface called IRoom; they cannot subclass the concrete
Room class hidden within the toolkit. The Room cannot store a strong
reference to the wiring object, because the Room was written by a
third party and is closed for change.

There seems to be only one general solution (that I can think of): a
subject needs to hold strong references to its observers.
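
Python is garbage collected, so the "who deletes it?" question doesn't
arise literally, but the ownership issue can be imitated with weak
references standing in for non-owning pointers. This is only an analogy
under that assumption; the Button, Wiring and CDPlayer classes below
are invented stand-ins for the toolkit classes described above.

```python
import gc
import weakref


class CDPlayer:
    def __init__(self):
        self.playing = False

    def play(self):
        self.playing = True


class Wiring:
    """Hypothetical listener that forwards a button press to a CD player."""

    def __init__(self, player):
        self._player = player

    def on_pressed(self):
        self._player.play()


class Button:
    def __init__(self, strong):
        self._strong = strong
        self._listeners = []

    def add_listener(self, listener):
        # Either own the listener or merely refer to it weakly.
        self._listeners.append(listener if self._strong else weakref.ref(listener))

    def press(self):
        for entry in self._listeners:
            listener = entry if self._strong else entry()
            if listener is not None:
                listener.on_pressed()


def press_plays(strong):
    player = CDPlayer()
    button = Button(strong)
    button.add_listener(Wiring(player))  # nobody else keeps the wiring object
    gc.collect()
    button.press()
    return player.playing
```

When the button owns its listeners, press_plays(True) is True. When it
only refers to them weakly, the wiring object has no owner at all and
is gone by the time the button is pressed, so press_plays(False) is
False.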

PROBLEM 5: Nested contract attachments

When OO is used to build complex run time assemblies, it can be
important for observers to connect to their subjects lazily (ie only
when they are themselves being observed). This can lead to subtle
reentrancy problems with attachments.

Consider a namespace object with two child objects named "x" and
"y". The object named "x" contains a formula that refers to
sibling y by name. When the object named "x" is first observed,
it in turn needs to attach to the parent namespace object. The
namespace object in turn has to attach to each of its children (because
the logical namespace is a function of the names of the children), and
in particular attaches to the object named "x".

So when an observer attaches to a subject and thereby makes it lively
(ie under contract to attach in turn to its independents), it is
possible that an upstream independent tries to attach back to a
dependent.

Nested attachments don't cause infinite loops. However, they make it
difficult to tell when an object logically has no more observers. The
problem is that some of the observers are associated with a nested
contract, and therefore it is possible that logically there are no
observers, yet physically there are. This is reminiscent of the
problem of detecting cyclic garbage using reference counting.
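
The namespace example can be sketched in Python roughly as follows.
LazyNode is a hypothetical class that attaches to its inputs when it
gains its first observer and detaches when it loses its last one; the
`display` object stands in for some external observer of "x".

```python
class LazyNode:
    """Attaches to its inputs only while it has at least one observer."""

    def __init__(self, name):
        self.name = name
        self.inputs = []
        self.observers = []

    def attach(self, observer):
        self.observers.append(observer)
        if len(self.observers) == 1:
            # First observer: the node becomes live and attaches upstream.
            for subject in self.inputs:
                subject.attach(self)

    def detach(self, observer):
        self.observers.remove(observer)
        if not self.observers:
            # Last observer gone: release the upstream attachments.
            for subject in self.inputs:
                subject.detach(self)


def stuck_observers():
    namespace = LazyNode("namespace")
    x = LazyNode("x")
    y = LazyNode("y")
    x.inputs = [namespace]     # x's formula looks up sibling y by name
    namespace.inputs = [x, y]  # the namespace depends on its children's names

    display = object()         # stands in for some external observer
    x.attach(display)
    x.detach(display)
    return [observer.name for observer in x.observers]
```

After the only external observer has gone, stuck_observers() still
returns ["namespace"]. Physically "x" has an observer, but that
observer is only there because "x" itself attached to the namespace, so
a simple observer count never reaches zero.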


SO WHAT'S THE SOLUTION? : I have only found one general solution
to these problems. Basically it is to throw away the observer pattern
(unless you're working on a trivial problem that provably doesn't
suffer from these issues), and instead use a more general
graph-theoretic approach, such as the theory of one-way or multi-way
constraints, or the spread *** algorithm etc.
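
To give a flavour of what a graph-theoretic approach can look like,
here is a deliberately simplistic Python sketch of a one-way constraint
network. It is my own illustration, not a reference implementation: the
Network class is hypothetical, it recalculates every formula on every
change, and it relies on graphlib from the standard library (Python 3.9
or later) for the topological ordering.

```python
from graphlib import TopologicalSorter


class Network:
    """One-way constraint network: each cell is a formula over other cells."""

    def __init__(self):
        self._values = {}
        self._formulas = {}  # name -> (function, input names)
        self.evaluation_order = []

    def add_input(self, name, value):
        self._values[name] = value

    def add_formula(self, name, function, inputs):
        self._formulas[name] = (function, tuple(inputs))

    def set(self, name, value):
        self._values[name] = value
        self.recalculate()

    def recalculate(self):
        # Map every formula cell to its predecessors; static_order() then
        # yields each cell only after all of its inputs.
        graph = {name: inputs for name, (_, inputs) in self._formulas.items()}
        self.evaluation_order = []
        for name in TopologicalSorter(graph).static_order():
            if name in self._formulas:
                function, inputs = self._formulas[name]
                self._values[name] = function(*(self._values[i] for i in inputs))
                self.evaluation_order.append(name)

    def get(self, name):
        return self._values[name]
```

Because the whole dependency graph is known in one place, each cell is
evaluated once, after all of its inputs, regardless of the order in
which the formulas were registered. A genuine cycle is reported by
static_order() as a graphlib.CycleError rather than recursing forever.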

Cheers,
David Barrett-Lennard
