Re: Observer pattern limitations



On 15 Jul 2006 18:05:09 -0700, David Barrett-Lennard wrote:

Dmitry A. Kazakov wrote:

It is an independent issue. I prefer to do it at once to minimize
distributed overhead of dereferencings. Lazy objects are implemented in a
different way. The object stays resident in the DB, only its small proxy
becomes memory-resident. That proxy can do cache-on-demand, with the same
effect as lazy references. It is more flexible, IMO.
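
A minimal sketch of the proxy idea described above, in Python. This is only an illustration under stated assumptions, not the poster's actual implementation: `Store`, `Proxy`, `load` and the dict-backed "DB" are hypothetical names invented for the example.

```python
class Store:
    """Hypothetical stand-in for the DB: maps an object id to its stored state."""

    def __init__(self):
        self.rows = {}
        self.loads = 0  # counts how often the "DB" is actually read

    def load(self, oid):
        self.loads += 1
        return dict(self.rows[oid])


class Proxy:
    """Small memory-resident stand-in; the full object stays in the store."""

    def __init__(self, store, oid):
        self._store = store
        self._oid = oid
        self._cache = None  # nothing fetched yet

    def _state(self):
        # Cache-on-demand: the first access reads the store, later ones do not.
        if self._cache is None:
            self._cache = self._store.load(self._oid)
        return self._cache

    def __getattr__(self, name):
        # Only reached for names not found on the proxy itself,
        # i.e. the fields of the target object.
        try:
            return self._state()[name]
        except KeyError:
            raise AttributeError(name) from None
```

Creating a `Proxy` costs no DB access; the first attribute read triggers one load, and later reads are served from the cache.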

Here is an attempt to understand you...

By "at once" I assume you mean synchronously or eager (rather than
postponed). I assume you are talking about some traversal over the
modified objects so that all the changes are written synchronously to
disk (apart from write caching). Furthermore this discovers new
objects that haven't been written to disk before. I presume they need
to be given some kind of persistable object identifier (OID).

Yes.

Somehow,
you are able to ensure that a consistent snapshot is written to disk.
You call this whole process "finalisation".

Finalization triggers the process. A weak reference held by the storage
object becomes a notification, before actual finalization starts. This
gives the storage object an opportunity to sync the memory object (which is
still fully functional), just before it gets finalized and then freed.

Are you familiar with stub-scion chains? Is that what you're doing?

In some sense yes. But the concept is IMO wrong from the wider perspective
of the type system. I consider references and pointers as subtypes of the target
type. So a stub to me is just a subtype, which implements the behavior of
the target type through marshaling, caching, whatsoever. In this sense it
is merely an implementation detail, not to be exposed.

Because you might wish to collect them.

I assume you mean GC.

Consider A, B -> C.

I assume you mean A,B,C are objects and A,B independently contain
references to C. [You are *not* talking about a dependency, ie that C
is calculated from A,B]

Right

When A becomes
memory resident, it forces C to become resident as well.

I assume you mean the reference to C from A is followed, causing C to
become memory resident.

Later on, A is destroyed. Then, you
have to decide if C should vanish only from the memory or from DB as well.
You cannot do it because B is out.

I assume you mean B is not resident in memory.

Yes

In my system memory-resident C is
finalized. Upon this the DB GC is run to determine whether C has to be
removed from the DB. If B exists, C is synced to DB. If not, C is removed
from DB.
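
The decision described above can be sketched roughly as follows. This is a toy model under assumptions: `Database`, `put`, `is_referenced` and `on_finalize` are hypothetical names, and a real DB-side GC would of course not scan every object.

```python
class Database:
    """Hypothetical store: oid -> (state, set of oids it references)."""

    def __init__(self):
        self.objects = {}

    def put(self, oid, state, refs=()):
        self.objects[oid] = (state, set(refs))

    def is_referenced(self, oid):
        # DB-side check: does any *other* stored object still refer to oid?
        return any(
            oid in refs
            for other, (_, refs) in self.objects.items()
            if other != oid
        )

    def on_finalize(self, oid, state, modified):
        """Called while the memory-resident object is still fully functional."""
        if self.is_referenced(oid):
            if modified:
                _, refs = self.objects[oid]
                self.objects[oid] = (state, refs)  # sync back to the DB
                return "synced"
            return "unchanged"
        self.objects.pop(oid, None)  # nobody refers to it: drop it from the DB
        return "removed"
```

With A, B -> C: once A is gone, finalizing the memory-resident C still finds B in the DB, so C is synced; if B is gone too, C is removed.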

IMO there are two important views of the system, and therefore two
types of GC. There is a "transient view" which considers all objects
in memory (both independents and dependents), and a "persistent view"
which only considers the independents.

If you take the pure functional dependency perspective on the dichotomy
of dependents versus independents, then *by definition*, dependents
contain references to independents (or other dependents) but never the
reverse. Dependents only exist for the purpose of caching the results
of calculations. The ability to navigate from independents to
dependences simply exists to allow cached results to be found or to be
marked as dirty.

There is a theoretical sense in which all the dependents can be
completely eliminated from the system without impacting the program
functionality. After all, caching is only for performance. For that
reason, it is possible to persist the independents without the
dependents, and it is also possible to GC the independents without the
dependents. ie by specifically not following the navigation links from
independents toward dependents. These are exactly the "observer
pattern" links - ie when a subject is able to notify its observers by
storing pointers to them.
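
A small sketch of this dichotomy, assuming Python and hypothetical names (`Independent`, `Dependent`, `attach`, `persist`). The links from independents to dependents are weak, so neither persistence nor GC has to follow them; only the dependents hold strong references toward their sources.

```python
import weakref


class Independent:
    """Holds primary state; knows its dependents only through weak links."""

    def __init__(self, value):
        self._value = value
        self._dependents = weakref.WeakSet()  # observer links, not owning

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, new):
        self._value = new
        for dep in list(self._dependents):
            dep.dirty = True  # notify observers: the cached result is stale

    def attach(self, dep):
        self._dependents.add(dep)


class Dependent:
    """Caches a calculation over independents; can be dropped and rebuilt."""

    def __init__(self, func, *sources):
        self._func = func
        self._sources = sources  # strong references toward independents
        self.dirty = True
        self._cached = None
        self.recomputed = 0
        for s in sources:
            s.attach(self)

    def get(self):
        if self.dirty:
            self._cached = self._func(*(s.value for s in self._sources))
            self.dirty = False
            self.recomputed += 1
        return self._cached


def persist(independents):
    # Only independents are written; observer links are deliberately not followed.
    return [i.value for i in independents]
```

Dropping every `Dependent` loses nothing but cached work: the values can be recomputed from the persisted independents.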

Yes. But note that the same object can migrate from one camp to another.
Unfortunately, it is impossible to stay fully functional, because some
objects are quite large for "cut'n'paste."

Firstly, that is a requirement. Secondly, to me it is not obvious why
persistence should slow down anything. A persistent object gets one weak
reference more. It is not a big deal.

I don't understand you.

A memory-resident object that persists has a weak reference from the
storage object where it persists. This has no [significant] overhead.

What I mean is that dependents include things like screen pixel values.
I assume you agree it would be very inefficient to persist snapshots
of the screen. How do you reconcile that?

I'm not sure what you mean. But bit blit is that thing. This is a different
theme, but I designed a widget library for real-time wave forms and list
views using this approach.

Do you mean writing modified
persistent objects back to disk?

Yes, if the finalized object persists and was modified.

How do you ensure integrity if objects are only written to disk when
they are finalised? What if there is a power failure, so your data is
in an inconsistent state?

It is not, because if a reference to a proxy object is held, that means
that this object is potentially being modified. To sync it prematurely
would mean to bring the system into an inconsistent state.

I'm talking about the problem of a
consistent cut or snapshot.

It is driven by user actions. The objects never know if a system they
comprise is consistent. It would be bad OO design. Each object is
responsible for only itself. A memory-resident object maintains its
invariant. The interplay of invariants is the user's responsibility. He can
combine objects into larger objects to enforce some higher-order
invariants. That's what OO design is all about.

There is no "durability", no such thing. There
is a larger system which includes the DBMS, that system is never off. If
you don't need to update the object, then that means that logically it is
not in the scope of the larger system. Quite simple, isn't it?

I'm struggling with understanding what you mean. In that sense I'm not
finding this simple at all!

Why? Each object has a scope which determines its lifetime. The scope of a
persistent object is that of the system in which it persists. What is called
a "persistent" object is just an object that exists in the scope of a DBMS,
DSN, or call it as you wish. For software design it does not matter whether
that system is a crystalline material exhibiting ferromagnetism with some junk
SQL interface or anything else.

I have. For all intents and purposes it is just as fast as a system
semaphore.

That's interesting. Do you use Windows or POSIX? Can you outline the
algorithm?

I use the semaphore in Win32. I last measured it some time ago on a
2GHz Pentium IV. It can be locked then unlocked in 2us. I used the
standard algorithm for exclusive write or shared read locks and
obtained the same performance because the time is dominated by the
semaphore. Note that once a shared read lock has already been gained,
subsequent shared read locks are granted far more quickly because they
only lock a Win32 critical section to protect the read count, and a
critical section is about 25 times faster than a semaphore on that
machine.

This is what I am doing too. First take a critical section, then look at
the write count, increase the read count then release it. If write count is
set, then release and wait for an event. If write is required, then spin
for a mutex, look for the read count ... It is much slower than plain
critical sections. Especially because there are tricky places where race
conditions need to be pounded out. Someday I'll give it another try in Ada.
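
For illustration only, here is a shared-read / exclusive-write lock sketched in Python. It is not the Win32 algorithm either poster used: a single `threading.Condition` stands in for the critical section plus event, and the class and method names are hypothetical. It also makes no attempt at fairness, so a steady stream of readers can starve a writer.

```python
import threading


class ReadWriteLock:
    """Many concurrent readers, or one writer, never both."""

    def __init__(self):
        # The condition's internal lock protects the counters below.
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = False

    def acquire_read(self):
        with self._cond:
            while self._writer:          # a writer is active: wait
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:       # last reader out wakes waiting writers
                self._cond.notify_all()

    def acquire_write(self):
        with self._cond:
            while self._writer or self._readers > 0:
                self._cond.wait()
            self._writer = True

    def release_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()      # wake both readers and writers
```

The cheap path mentioned above shows up here too: an uncontended `acquire_read` only takes the internal lock briefly to bump the read count.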

That is the case for intervals and fuzzy numbers. The problem there is
aliasing, not side effects. For both the result of a*b depends on whether a
and b are *independent*. For example intervals: [-1,2]*[-1,2]=[-2,4], yet
[-1,2]^2=[0,4].
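
The two results above can be reproduced with a few lines of Python. The `Interval` class and its `square` method are hypothetical names for this sketch; `*` assumes its operands are independent, while `square` knows both factors are the same variable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    def __mul__(self, other):
        # Treats the operands as independent: any pairing of values is possible,
        # so the bounds come from the four endpoint products.
        products = [self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi]
        return Interval(min(products), max(products))

    def square(self):
        # Same variable on both sides: x*x can never be negative.
        lo, hi = abs(self.lo), abs(self.hi)
        if self.lo <= 0 <= self.hi:
            return Interval(0, max(lo, hi) ** 2)
        return Interval(min(lo, hi) ** 2, max(lo, hi) ** 2)


x = Interval(-1, 2)
print(x * x)       # Interval(lo=-2, hi=4)
print(x.square())  # Interval(lo=0, hi=4)
```

Note that `x * x` still yields [-2,4] even though both operands are the same object: the operator cannot see the aliasing, which is exactly the problem.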

Hmmm. I think the functional programming perspective is too important
to ignore.

I just described my perspective in some detail in a post to Leslie
Sanford. I would greatly appreciate it if you could have a read and
comment.

I don't ignore it, I just point out its limitations. Some of them are
fundamental, like the need for independence analysis above.

I don't see that!

That's a fact of your own biography; the thing is a hard mathematical
reality, you're platonic, after all! (:-))

In the case of intervals the result of a*b depends on the joint
distribution of a and b over R. When they are independent, then the
distribution can be represented as a product of the individual
distributions (min is the multiplication operation in that case). If a and
b are not independent (as in the case of x*x), then the result of * is
inaccurate (it remains an estimate from above, in some sense, which makes
interval arithmetic sound, but that's another story).

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de