Re: Question on LSP



Responding to Kazakov...

It is far from trivial. There is the aliasing problem. There is the problem of
stored addresses. There is the problem of the same thing having different
equivalent addresses in the same context. There is the problem of addresses
depending on the context, etc.

Been there; done that. It is not a big deal if you know from the outset that the addresses will move. Since most object addresses in an application don't move, one only has to provide encapsulation and infrastructure for those that do. (Hint: one manages it the same way the OS virtual memory manager does it -- indirection through a table lookup.)

[Some of that encapsulation one gets for free in a well-formed OO application because objects are /never/ directly visible outside the subsystem they live in. So identity mapping to addresses in distributed contexts is always managed in the interface through table lookups of handles anyway.]
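
[A minimal sketch of that table-lookup indirection, in Python purely for illustration. The names HandleTable, Account, and relocate are hypothetical and not from any real library; clients hold only a handle, so the object behind it can "move" without breaking referential integrity.]

```python
class HandleTable:
    """Maps stable handles to objects whose storage location may change."""

    def __init__(self):
        self._slots = {}        # handle -> current object
        self._next_handle = 1

    def register(self, obj):
        """Store obj and return the handle that clients keep instead of an address."""
        handle = self._next_handle
        self._next_handle += 1
        self._slots[handle] = obj
        return handle

    def resolve(self, handle):
        """One level of indirection: look the current object up by handle."""
        return self._slots[handle]

    def relocate(self, handle, new_obj):
        """Simulate a move: the table entry changes, the handle does not."""
        self._slots[handle] = new_obj


class Account:
    # Hypothetical problem-space object used only for this sketch.
    def __init__(self, balance):
        self.balance = balance


table = HandleTable()
h = table.register(Account(100))
original = table.resolve(h)

# "Move" the object by copying it into fresh storage and updating the table.
table.relocate(h, Account(original.balance))
moved = table.resolve(h)
```

The client code never changes: it keeps using h, and only the table knows where the object currently lives.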


So in effect, you have thrown away these machine addresses by replacing them
with some type implemented by the LUT. This is exactly what I meant.
Identity is not God-given and the machine address is its prophet. No.
Identity is just a property like any other, and to implement it you can use
any type you find appropriate.

Not quite. I am saying that object addresses generally do not move so one implements object identity that way routinely. IOW, addresses are the default identity for objects in memory. So long as the addresses don't move the OOA/D referential integrity maps directly and the developer does not need to do anything special at the OOP level. One only has to deal with changing addresses when software has been provided to explicitly move them (e.g., a memory manager) or explicitly decouple them (e.g., a distributed boundary). In those cases the application developer must supply explicit infrastructure in the application to deal with referential integrity.

However, I can certainly understand your reluctance to regard object addresses as identity. Trying to map that sort of identity into a type system requires substantial legerdemain. B-) So you have to leave it as an exercise for the language implementation behind the scenes by making identity implied in the type system. But that doesn't change the fact that an address is a valid identity for an object.

Why can't construction paradigms be equivalent, in the sense that you would
be able to judge impartially their applicability in each concrete
case?

Because construction paradigms have different goals. If all we wanted to do was write compact programs quickly and intuitively, we would all be doing functional programming. If our dominant problem is converting an RDB view to a UI view and vice versa, we would all be doing P/R. But OO development has a quite different goal: long term maintainability in the face of volatile requirements for large applications. Different goals require different construction practices.


Huh, it is like saying that different destinations require differently
constructed trains. In Europe it was definitely so (and partially remains
so) until all countries agreed on one width of the track. The obstacles are
political, not technical.

Nice try but the analogy is at the wrong level of abstraction. To travel some distance over land one can fly, drive an automobile, or take a train. Each mode of transport satisfies different goals for the passenger in different contexts. Each mode requires substantially different transport mechanisms and those mechanisms require quite different interactions with the passenger.

Well, clearly, pure interface substitutability has fewer problems. But there
are important cases where T is not just an interface.

But that's all a type is: an interface to a specific set of properties.


+ an implementation of.


Types basically exist because all 3GLs use procedural message passing by definition.


I beg to disagree. I think types exist because they have to in any more or
less complex logical framework. In that sense 3GL types are
Russell-Whitehead's types.

I agree types have broader application. But whether type systems have broader application or not now, most of the original work was designed to map into the 3GL level of abstraction of the hardware. My point here is merely that the bulk of type system research would never have taken place when it did if there weren't a pressing need to formalize 3GLs.

The type just defines the signatures of a specific set of procedures. (An ADT is just a special case where access of data is cast to procedure getters/setters.) One only has polymorphic dispatch if different objects having different procedures can be accessed using the same procedure signatures (i.e., through the same type).


It is a somewhat "pattern matching" wording (flawed, IMO), but in essence
it is correct. To dispatch one needs exactly one type, and that is a class.


Types [T1] and [T2] are not substitutable for any of the specialized properties of [T1] and [T2] under any circumstances. However, they are <supposed to be> substitutable for the properties defined by [T]. There are two access possibilities:

Yes, but again there are cases where the graph above has cycles, i.e. when
you might want to use T1 and T2 interchangeably:

[T1] <------> [T2]

But only for the properties defined by [T].

Also, I'm afraid I don't see any graph cycle here.


I don't have your nice tool for drawing 2D graphs in ASCII. OK I'll try. If

It's called a keyboard. B-)

you want an equivalence between three types there are many possibilities.
One is:

      [T]
     A   A
    /     \
   V       V
[T1]       [T2]

Subtyping is transitive. So because T1 is a subtype of T it is also one of
T2, because T2 is a supertype of T. And in reverse, T2 is a subtype of T1.

Picture me jumping up and down screaming, "No! NO! NO-NO-NO-NO-NO!"

Perhaps for purposes of designing some arcane language one can regard sibling subtypes as somehow being subtypes of one another, but never in an OOPL context. T1 and T2 are semantically completely different types. They reflect /disjoint/ set membership in the problem space. That (disjoint sets) is a matter of methodological definition in the construction paradigm and the OOPL damn well better enforce it. T1 and T2 are subsets of T, but not each other. IOW, one can say unequivocally that: T1 is-a <member of> T; T2 is-a <member of> T; T1 is-not-a <member of> T2; and T2 is-not-a <member of> T1.

This is an example of the difference between OOA/D subclassing and OOP subtyping. The set membership constraints here might not apply in general to type systems, but in the case of OOPLs they /must/ be honored by the type system.
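
[To make the set-membership claims concrete, here is a small sketch in Python (chosen only for brevity). The class names follow the T, T1, T2 of this thread; the method names foo, only_t1, only_t2 and the helper use_as_t are assumptions made up for the illustration.]

```python
class T:
    def foo(self):
        # Property defined by T; every member of T has it.
        return "T.foo"


class T1(T):
    def only_t1(self):
        # Specialized property: exists only for members of the T1 subset.
        return "T1 specific"


class T2(T):
    def only_t2(self):
        # Specialized property: exists only for members of the T2 subset.
        return "T2 specific"


def use_as_t(obj: T) -> str:
    """A client that relies only on the properties defined by T."""
    return obj.foo()


# Siblings are substitutable for T's properties...
results = [use_as_t(T1()), use_as_t(T2())]

# ...but neither sibling is a member of the other's set.
t1_is_t2 = issubclass(T1, T2)
t2_is_t1 = issubclass(T2, T1)
```

Here both siblings are accepted wherever a T is expected, yet t1_is_t2 and t2_is_t1 are both False, and a T2 has no only_t1 at all.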

So the path:

[T1] <------> [T2]

[...]

Peer-to-peer collaboration works for any problem and can be transformed to any implementation environment. Things like tripartite messaging don't. More to the point, broadcasting can always be emulated with multiple peer-to-peer messages so one doesn't need a special construct for those situations where it occurs.
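
[A minimal sketch of emulating a broadcast with plain peer-to-peer messages, in Python for illustration only. Receiver, send, and broadcast are hypothetical names, and the "shutdown" message is made up.]

```python
class Receiver:
    def __init__(self, name):
        self.name = name
        self.inbox = []

    def receive(self, message):
        self.inbox.append(message)


def send(receiver, message):
    """One peer-to-peer message: one sender, one receiver."""
    receiver.receive(message)


def broadcast(receivers, message):
    """Emulated broadcast: nothing more than one peer-to-peer send per receiver."""
    for receiver in receivers:
        send(receiver, message)


peers = [Receiver("a"), Receiver("b"), Receiver("c")]
broadcast(peers, "shutdown")
```

No special construct is needed; the loop over send is the whole "broadcast".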


But it is the same argument RM people are using to justify the limitations of
their approach. You can prove 1+1=2 in a 120-page proof starting from the ZF set
axioms. The question is why you should do that? An integer might be a set,
but it would be awful to deal with integers in such a representation. So the
peer-to-peer collaboration might be complete (though I reserve doubts about
it), but that does not make it KISS. I think double dispatch is quite a
natural and intuitive concept. If not, then the whole idea of dispatch (and
so polymorphism) was rubbish.

But double dispatch is not readily implemented in all implementation environments. The OO paradigm needs to be implementable in all environments. Moreover, it must be implementable in an unambiguous manner so if one must bootstrap infrastructure to support double dispatch, one opens the can of worms for validating the infrastructure.

IOW, single dispatch through procedural message passing is already implementable in any reasonable 3GL so all the OO paradigm is doing is making sure that OO abstractions can always be mapped directly and unambiguously to procedural message passing. So if the OOP programmer decides it would be neat to implement with double dispatch, that's fine. But it is then the OOP programmer's responsibility to provide a support infrastructure that will behave exactly the same way as the single dispatch solution.
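
[One common way to supply that infrastructure is to chain two single dispatches. The sketch below is in Python and is only an illustration under that assumption; Body, Asteroid, Ship, collide_with, and the hit_by_* methods are all hypothetical names.]

```python
class Body:
    def collide_with(self, other):
        raise NotImplementedError


class Asteroid(Body):
    def collide_with(self, other):
        # The first single dispatch already selected Asteroid;
        # the second single dispatch selects on the type of `other`.
        return other.hit_by_asteroid(self)

    def hit_by_asteroid(self, asteroid):
        return "asteroid/asteroid"

    def hit_by_ship(self, ship):
        return "ship/asteroid"


class Ship(Body):
    def collide_with(self, other):
        return other.hit_by_ship(self)

    def hit_by_asteroid(self, asteroid):
        return "asteroid/ship"

    def hit_by_ship(self, ship):
        return "ship/ship"


outcomes = [
    Asteroid().collide_with(Ship()),
    Ship().collide_with(Asteroid()),
    Ship().collide_with(Ship()),
]
```

Every call in the chain is an ordinary single-dispatch message, so the behavior can be validated the same way as any other single-dispatch code. The cost is that each class must know a hit_by_* method for every sibling.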

BTW, a litmus test for any theory, ever since Cantor's time, is: how do
you construct numbers in your paradigm? (RM people are of course unable to
answer this in any reasonable way.) What about your model? Suppose you have Z
and Q. Along which relationship is "+" sent?

In a practical sense it is not relevant at the OOA/D level. Such things are really only of concern once hardware gets into the picture. So one needs to narrowly define semantic abstractions at the 3GL level for that sort of thing to be consistent with the hardware models. [Which involves compromises, like defining knowledge responsibilities in terms of memory data stores. That, in turn, leads to language pitfalls for access decoupling and the whole getter/setter debate.]

At any higher level of abstraction, one just accepts an existing general model for things like numbers and arithmetic operations. IOW, I don't need to "construct" them in my OOA/D; I already know what they are.


The song remains the same - how do you know that structures as complex as
numbers would never appear in the problem space? If you cannot handle
numbers, why are you sure that you can handle anything else? (This is one reason
why I count nGLs as a lower-level abstraction than 3GLs.)

I am sure I can handle numbers because mathematics has provided workable rationalizations for the way the problem space works with numbers and it has provided hardware computing models so that software can work with numbers. The point is I don't need to reinvent that wheel to solve a problem in hand either in OOA/D or in a 3GL. IOW, I don't care how mathematics was able to rationalize numbers; I only care about manipulating them in a manner consistent with the problem space.

It knows that T has Foo, this is what allows you to do Ptr.Foo (). Only
typeless pointers know [almost] nothing. But we don't want them at all.

T knows, but not T*. T* is simply an indirection to get to a T.

Note that the language allows us to use a name like 'T' on the reference as a mnemonic so that the developer can keep track of what is happening with the indirection. Object* would be perfectly acceptable as an indirection mechanism in providing correct code. But the code would be rather unreadable and the compiler would be less able to detect developer screw-ups. IOW, the 'T' is just syntactic sugar designed to address the developer's problems of maintainability and fallibility rather than the correctness of the solution.

I see. Well, it is a very retrograde view, I must say. But let's take it
for a while. It would just mean that you don't have a pointer type as a proper
type. That effectively closes any discussion about whether a non-type is a
subtype or not. Modern languages have proper pointer types. You can define
new operations on them, you can influence their representation, etc. In this
case you cannot wave your hands and say it is just sugar. Because it is not.
Whether you need such things in your model is another question. Note
though, that assuming your sugar always exists heavily damages language
performance (it prevents some valued optimizations) and is plain wrong when you
are dealing with things which semantically cannot be referenced (values in I/O
registers, clock readings, random generator realizations).

Retrograde?!?

I didn't say a pointer isn't a type. I just said that its type semantics has nothing to do with the T type.


Hey, a type's semantics is up to the developer's discretion!

Sure, the developer must keep track of which type has been assigned to each pointer to manage complexity in creating the design, which is why the 'T' is in T*. But that has nothing to do with what a pointer type is. The pointer type is defined independently of T. IOW, the pointer type is the '*' part of T*.

If it has a type it is something generic like Object Reference.

BTW, back to the original point, an Object Reference is polymorphic in the sense that the reference assigned to it can be for any Object subtype (T, T1, T2, S, whatever). I now suspect that is what you had in mind when I originally disagreed that pointers are polymorphic.


Yes, that's a polymorphic pointer.


That's the language designer view.

But I am looking at it from the application developer view. In the context of OOPL semantics it is never polymorphic _once it is instantiated_. The syntactic sugar of T* is just evidence of the fact that a given pointer cannot be assigned to an object of any other type than the one the developer had in mind when instantiating it.


No, that's a different case.

1. The first one is polymorphic pointers. That is when a pointer points to
a class. The target is of any type derived from T. The class is based on
the set of types {T, T1, T2, S ...}. One has to distinguish a pointer to T
(and only T) from a pointer to the class rooted in T. They are of different
types.

I don't think so. In my view you are talking about <at least> four different things here.

(1) The fact that a pointer type can be instantiated to any number of types when it is instantiated makes it -- at best -- a polymorphic type. But that does not mean that the semantics of an object reference is polymorphic.

(2) The semantics of Object Reference can be defined without regard to what the types are of the objects to which it is assigned. Those object types are completely orthogonal to the semantics of being an object reference.

(3) Whether T is subclassed does not matter to the semantics of a reference to a T type. There is only one T set regardless of how many subsets it may have. A pointer to T is a pointer to any object of that and only that set, which has nothing to do with whether T has been further subdivided in some other context.

(4) How an object reference may be instantiated is quite different than how an object reference is used or manipulated. Any type polymorphism only exists prior to instantiation. But once the object reference has been instantiated there is one and only one type associated with it throughout its life, regardless of how many unique objects may be assigned to it during that life.


2. When methods are accessible via a pointer. Your language could allow you
to have polymorphic objects of the class based on the set of types {T, T*,
T**, T***, ...}. On such an object, for which you don't know whether it is T
or T* or T** etc., a call to Foo will dynamically dispatch.

Methods aren't directly accessible via pointers in an OO context; only objects are. One of the severe problems with using type systems is that it encourages exactly that view because it marries message and method in the method signature. That alone accounts for a large fraction of the really bad OOPL code around; they are just FORTRAN and C programs with strong typing because that marriage allowed procedural developers to overlay procedural construction paradigms on the OOPLs.

Consider an object whose behavior is described with a state machine. What is the type of the object? It can only be described in terms of the state action methods that it possesses. But the client sends an event message to the object. The object provides a mapping of the event identity and its current state to an action through the STT. Clearly the client does not access individual methods through pointers; it only accesses the object through a pointer to send it a message without even knowing what actions are available.

The fact that procedural message passing has trashed the decoupling of message and method has -- very unfortunately -- been preserved in the 3GL type systems. One of several casualties is the idea that objects collaborate, not methods. In the OO paradigm a pointer can only point to an object, not to an individual method. Mapping to a method can only be done based upon the message identity (and current state for object state machines). IOW, one pointer; many messages and even more methods.
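
[A minimal sketch of the state-machine case, in Python for illustration. The Door class, its states, events, and actions are all hypothetical, and silently logging an event that has no transition is just one possible policy assumed here.]

```python
class Door:
    def __init__(self):
        self.state = "closed"
        self.log = []
        # State transition table (STT):
        # (current state, event) -> (action, next state)
        self._stt = {
            ("closed", "open_requested"): (self._swing_open, "open"),
            ("open", "close_requested"): (self._swing_shut, "closed"),
        }

    def send(self, event):
        """The only thing a client does: send a message to the object."""
        entry = self._stt.get((self.state, event))
        if entry is None:
            # Assumed policy: events with no transition are logged and ignored.
            self.log.append(f"ignored {event} in {self.state}")
            return
        action, next_state = entry
        action()
        self.state = next_state

    def _swing_open(self):
        self.log.append("swinging open")

    def _swing_shut(self):
        self.log.append("swinging shut")


door = Door()                  # the client holds one reference to the object
door.send("open_requested")
door.send("open_requested")    # no transition defined from "open"
door.send("close_requested")
```

The client holds one reference and sends event messages; which action runs is decided inside the object from the event identity and the current state.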

My claim is that formally and technically T<-T1 is nothing better than
T<-T*. Case 1 is irrelevant to the issue, because it is covered by
T<-T* (where "class T" is substituted for T).

I strongly disagree with this in an OO context because of (3) above. You are arguing that T* is equivalent to {T1* | T2*}. That is quite true for polymorphic dispatch *IF* T is subclassed. However, the semantics of an object reference doesn't depend on T being subclassed. The object reference just points to a T, which becomes abundantly clear when T is not subclassed so there is no possible substitution, yet the pointer semantics is unchanged.

IOW, the polymorphic die can not be cast once one instantiates the Object Reference. It can be reassigned to different objects of the same type, but it cannot be reassigned to objects of different types. As a developer, that construction constraint is absolutely essential to maintaining my sanity because referential integrity is a problem /after/ pointers are instantiated.
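
[A rough sketch of that construction constraint, in Python for illustration only. Ref is a hypothetical class, and the runtime isinstance check merely stands in for what a statically typed OOPL would enforce at compile time; T, T1, and S follow the names used in this thread.]

```python
class T:
    pass


class T1(T):
    pass


class S:
    # An unrelated type, not a member of the T set.
    pass


class Ref:
    """A reference bound to exactly one target type for its whole life."""

    def __init__(self, target_type, target):
        self._type = target_type   # fixed at instantiation, never changed
        self._target = None
        self.assign(target)

    def assign(self, target):
        # Reassignment is allowed only to members of the same set.
        if not isinstance(target, self._type):
            raise TypeError(f"expected {self._type.__name__}")
        self._target = target

    def deref(self):
        return self._target


ref = Ref(T, T())
ref.assign(T())      # a different object of the same type: fine
ref.assign(T1())     # a T1 is a member of the T set: fine

try:
    ref.assign(S())  # not a member of T: refused
    rejected = False
except TypeError:
    rejected = True
```

The reference can point at many different objects during its life, but the set it may draw them from is fixed once it is instantiated.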


Yes, but it is just a statement about values of the pointer type. A pointer
to integers is constrained to point to integers in exactly the same sense as
an integer is constrained to be an integer. That does not prevent either from
becoming a subtype of any other type.

We are going in circles. As I said before, I have no problem with subclassing Object Reference to allow it to point to objects, values, methods, or whatever else one might need in some arbitrary language design. My objection lies in attributing subclassing of the thing pointed at to the pointer semantics itself. What the pointer type is and what the type of the target is are two entirely different things. Just because the target is subclassed does not make the reference subclassed. Similarly, just because polymorphic dispatch is possible for the target does not mean that the pointer itself is polymorphic.

Foo can only be called on T.

That's Foo 1. Foo 2 can be called only on T*. The polymorphic Foo can be called
on any one of the class.

What?!? We seem to be on different planets again. If T* is pointing to a T, then the only Foo available is the one on T. T* is just an indirection mechanism to get to a T. One /never/ gets a different Foo from accessing through T* than one gets by accessing directly from T.

See above. In my model T* and T are different types. You just cannot call
T::Foo on T*. It is no-no. So you call T*::Foo which is *implemented* as
pointer dereferencing followed by a call T::Foo.

I agree that Foo is not an element of the reference type and I am not suggesting one is calling Object Reference::Foo. (Note that I said explicitly that Foo is only available on a T.) However, the constraints on what the Object Reference type can point to that preclude polymorphism _once it is instantiated_ allow the language to hide the indirection. Again, the language is just providing convenient syntactic sugar to allow a Client to access a T::Foo indirectly through a reference.


It can be dynamically instantiated as in 2. And I disagree that the time of
binding may influence semantics. It could be sugar, but there is
something you want to sweeten...

Where did I mention the time of binding? I just said that the language is able to hide the indirection because a pointer cannot be polymorphic once it is instantiated.

If one could, one has Anarchy rather than an OOPL. B-)

One could! (:-)) You can define T*::Foo so that it would call T::Bar
instead. In any case, a reasonable implementation of T*::Foo would throw an
exception if the pointer is invalid.

Here I am referring to the notion that T* is inherently polymorphic. Once it is instantiated it can't be or else one has the same sort of anarchy as FORTRAN's assigned goto.


(Maybe assigned goto was the first example of dispatching calls! (:-))

That is the polymorphism. You don't know what it does. After all, T::Bar
could marshal T to another computer, call T::Foo there and marshal the
result back. But if that fulfills the contract, why do you care?

I agree that the assigned goto is polymorphic. My issue here is that some forms of polymorphism lead to software anarchy and the OO paradigm ensures that doesn't happen by putting constraints on polymorphism -- such as a pointer cannot be polymorphic once it is instantiated. IOW the OO paradigm simply does not go to the places where FORTRAN's assigned goto lives.

When the client invokes Foo, it is always from T, regardless of indirection or what language implementation mechanism is used for addressing a T. That's why modern OOPLs don't make a distinction and always address T.Foo. The fact that the language always introduces a hidden T* is a pure language implementation issue.

Maybe, but I don't want implicit assumptions. If T* is a side effect, then
one cannot talk about identity. If it is an intentional choice to achieve
identity, then that should be made explicitly.

Well, personally I much prefer the always-a-reference approach. It leads to a lot less foot-shooting.

But it is fundamentally inconsistent with the always-an-object approach. You
need things which aren't references. A reference itself is not a reference. You
will lack too much reflection if you try to pursue this approach to
its logical ends.

I don't see an inconsistency. [In fact, I don't know what the always-an-object approach is. B-)] The reference is always assigned to exactly one object so referential integrity is unambiguous about identity. I also don't see why one needs problem space objects embedded in the implementations of other problem space objects. (For computing space bric-a-brac like Array and String that the OOPLs provide, it is a different story because those things describe the implementation of the object.)

As a practical matter languages like Java that use only references seem to be a lot easier to learn and use than languages like C++ that provide both embedding and references.

I also don't see what reflection has to do with this issue.


Is a reference an object? Is this object the same as the one it points to? When you
aggregate references, is the aggregate one of reference objects or of target
objects? Is the aggregate itself an object? Is it a reference? Do you need
objects, references, aggregates of objects, aggregates of references,
references to references, references to aggregates of references of
aggregates, and so on and so forth, all as distinct entities hard-wired in the
language?

In order:

An object is not a reference. A reference is a 3GL implementation artifact that is independent of problem space entity abstractions (even if someone chooses to make it an object in some language meta model).

The reference is entirely separate and conceptually different than the target it points to.

The aggregate is of references. Is this a trick question? B-)

Yes, but usually a computing space object. So the notion of 'object' may be somewhat flexible across 3GLs.

An aggregate is not a reference (though it may be accessed by references and it may aggregate only references).

References in an OOPL are not necessary but they sure make things much more convenient. For example, it would be very difficult to implement an OOPL in preprocessor mode (a la cfront or Eiffel) over a language that did not support references and direct address manipulation.

[Note that there is no notion of 'reference' in some of the abstract action languages used for OOA/D modeling. Those that do provide it usually use it simply to mean an abstract handle for an object returned by a relationship navigation construct.]

However, I still don't see what this has to do with reflection. Or are you talking purely about language implementation mechanisms?


*************
There is nothing wrong with me that could
not be cured by a capful of Drano.

H. S. Lahman
hsl@xxxxxxxxxxxxxxxxx
Pathfinder Solutions -- Put MDA to Work
http://www.pathfindermda.com
blog: http://pathfinderpeople.blogs.com/hslahman
Pathfinder is hiring: http://www.pathfindermda.com/about_us/careers_pos3.php.
(888)OOA-PATH


