Re: LSP and subtype



Responding to Huber...

That segues to the notion that the client has to understand _the
whole tree_ to access it properly.  For example, suppose we can't
instantiate Base on a standalone basis and we have:

                  [Base]
                  + foo()
                    A
                    |
          +---------+--------+
          |                  |
        [Sub1]            [Sub2]
        + foo()

where [Sub2] gets the Base.foo() implementation.  Suppose a Client
invoking Base.foo() is happy with either implementation so all is
well.



Why would the client ever care about what implementation he gets? All Sub1.foo() must respect are the invariants, pre- & postconditions that were established by Base. If Sub1.foo() does so then the client will never notice a difference. If Sub1.foo() fails to do so then the LSP is violated and Sub1 is faulty.


As I said, the stipulation is the Client is happy with either
implementation, so it can invoke Base.foo().


Um, why is the "happiness" of some *particular* client code important?
As long as all foo overrides respect the invariants, pre- &
postconditions that Base documents/relies on then *any* imaginable
client code will run just fine with *any* subclass object of Base, won't
it? IOW, the client just needs to be happy with the documented behavior
(again invariants, pre- & postconditions) of Base and never needs to
know that there are in fact subclasses and overrides.

The relationship between the Client and the member of the tree needs to be defined for each client. When evaluating the collaboration the developer must determine where in the tree the Client will access objects. That decision must be made on a client-by-client basis.


For the tree above it is entirely possible that for a given class of Client objects the only behavior that is acceptable is Sub1.foo(). If that is the case the developer must establish a relationship between the Client class and only the Sub1 class. That is the only way to capture the constraint on the collaboration that the Base.foo() behavior is not acceptable to the Client.

Now suppose somebody comes along doing maintenance and decides we
need a third implementation:

                  [Base]
                  + foo()
                    A
                    |
          +---------+--------+--------------+
          |                  |              |
        [Sub1]            [Sub2]          [Sub3]
        + foo()                           + foo()

This may break the original Client if the Sub3.foo() implementation
is unacceptable (i.e., an LSP violation from the Client
perspective) even though the original Client and its context were
not touched.
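
A minimal sketch of that maintenance scenario, in Python. The language, the string return values, and the names ACCEPTABLE_TO_CLIENT and client_is_happy are assumptions made for illustration; only Base, Sub1, Sub2, Sub3, and foo() come from the diagrams.

```python
class Base:
    def foo(self) -> str:
        return "base"


class Sub1(Base):
    def foo(self) -> str:  # overrides Base.foo()
        return "sub1"


class Sub2(Base):
    pass  # gets the Base.foo() implementation


class Sub3(Base):
    def foo(self) -> str:  # third implementation added during maintenance
        return "sub3"


# Hypothetical stipulation: this Client is happy with only these two behaviors.
ACCEPTABLE_TO_CLIENT = {"base", "sub1"}


def client_is_happy(obj: Base) -> bool:
    """The Client invokes foo() through Base and checks what it actually got."""
    return obj.foo() in ACCEPTABLE_TO_CLIENT


# The Client code above is untouched, yet the outcome changes once Sub3 exists.
results = {cls.__name__: client_is_happy(cls()) for cls in (Sub1, Sub2, Sub3)}
print(results)
```

Nothing in client_is_happy changed between the two trees; only the set of objects that can reach it did.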



This is just an example that derived classes must observe the LSP or else there's a bug in the software. I don't see how this backs your assertion that one should not be able to instantiate Base classes.


That was pretty much my point.  The pitfall from an LSP perspective is
that one cannot arbitrarily add new subclass implementations without
validating them against the existing client access contexts.


Same disagreement as above, I think you can leave client access contexts
out of the discussion.

If Base.foo() and Sub1.foo() are acceptable to the existing clients one still has to make sure that Sub3.foo() is also acceptable. This is a really basic maintenance issue anytime one updates a subclassing tree. If there are overrides it is possible to break the existing clients when an instance of the new subclass is presented to them. To put it another way, validating the clients' access is the only /practical/ way to ensure one is not creating an LSP problem when adding the new behavior to the tree.



Apropos of your point above about DbC conditions, there is really no
way to write such conditions in practice to deal with OO inclusion
polymorphism.  That's because one is not really substituting different
implementations of the same semantics; one is substituting different
behavior semantics.  So the post conditions between subclasses will
usually be different.  IOW, adding a new subclass behavior almost
always changes the base class' ORed postcondition specification.

The only way to effectively deal with that is by raising the level of
abstraction of the base class condition specification.  So if we have:

           1        attacks 1
[Predator] ------------------ [Prey]
                                A
                                |
                    +-----------+
                    |           |
                [Gazelle]  [Impala]
                + run()    + run ()

we can implement a Prey.run behavior for the base class and things
will Just Work from an LSP perspective.  But now suppose we add:

           1        attacks 1
[Predator] ------------------ [Prey]
                                A
                                |
                    +-----------+----------+
                    |           |          |
                [Gazelle]  [Impala]   [Pheasant]
                + run()    + run ()   + takeFlight()

Now Prey.run() doesn't cut it.  We can fix that by abstracting the
postcondition and providing a Prey.flee() behavior.  However, if we
add:

           1        attacks 1
[Predator] ------------------ [Prey]
                                A
                                |
                    +-----------+----------+-----------------+
                    |           |          |                 |
                [Gazelle]  [Impala]   [Pheasant]       [Brontosaurus]
                + run()    + run ()   + takeFlight()   + stomp()

we have a new problem.  We can raise the level of abstraction again
and upgrade the [Prey] postcondition to Prey.respondToAttack().


This just shows that run() was wrong to begin with. I'm suspicious whether
even respondToAttack is right, because technically a predator
doesn't/shouldn't really tell its prey to respond to its attack. I think
the attack should more adequately be seen as an event to which Prey
subclasses will react. That is, in a real-world design there's usually
one more level of indirection. But I guess that's what you're hinting at
below anyway.

run() is only wrong when one does maintenance and adds new subclasses; in the first situation it was Just Fine. To ensure that the Base postconditions will /always/ be valid for LSP one would have to be prescient about every possible requirements change that might lead to providing new behaviors.
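
One way to picture the progression from run to flee to respondToAttack is to model each base class postcondition as the set of outcomes it allows. This is only a sketch in Python under that assumption; the POSTCONDITIONS table, the satisfied_by_all helper, and the outcome strings are hypothetical, and the method is spelled respond_to_attack to follow Python naming.

```python
from abc import ABC, abstractmethod


class Prey(ABC):
    @abstractmethod
    def respond_to_attack(self) -> str:
        """Return a label for what the prey did (hypothetical convention)."""


class Gazelle(Prey):
    def respond_to_attack(self) -> str:
        return "ran"


class Pheasant(Prey):
    def respond_to_attack(self) -> str:
        return "took flight"


class Brontosaurus(Prey):
    def respond_to_attack(self) -> str:
        return "stomped"


# Each postcondition is modeled as the set of outcomes it allows.
# Every new behavior forces a wider (more abstract) set.
POSTCONDITIONS = {
    "run": {"ran"},
    "flee": {"ran", "took flight"},
    "respondToAttack": {"ran", "took flight", "stomped"},
}


def satisfied_by_all(postcondition: str, herd: list[Prey]) -> bool:
    """True when every member's response is allowed by the named postcondition."""
    allowed = POSTCONDITIONS[postcondition]
    return all(p.respond_to_attack() in allowed for p in herd)


herd = [Gazelle(), Pheasant(), Brontosaurus()]
for name in POSTCONDITIONS:
    print(name, satisfied_by_all(name, herd))
```

Each widening keeps the tree consistent, but a client that was written against the narrower set still has to be rechecked against the wider one.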


[BTW, I agree with you about the naming. That's an OOPL problem because message and method are not separated and one names messages by what the receiver responds. As a translationist I only use state machines for describing object behavior responsibilities. Then events are pure messages that are separate from the response actions. So what I define at the OOA level for [Prey] are the <announcement> events that the [Prey] will accept. It is up to the interface to dispatch the event to the right state transition in the subclasses' object state machines. So I would probably name the event "attack" or "chase". The states in the individual subclass state machines would have state/action names like ran, tookFlight, and stomped (past tense because I use Moore state machines).]

However, the reality is that so long as we keep adding new behaviors
we have to keep upgrading the notion of the [Prey] behavior.  More
important, we have to check every existing client to make sure the new
postcondition is still acceptable.


Ok, I guess I'm starting to see why you keep saying that one needs to
check existing clients when one adds new subclasses. I agree that you
need to do that in the run --> respondToAttack scenario, because you
have effectively changed the behavior/interface of Prey. There's no way
around that.

I believe that is generally true in OO subclassing because we do routinely provide different behaviors. Base behaviors like "sort" where one can truly just substitute different implementations without affecting the semantics tend to be quite rare. [In fact, I have a hard time thinking of a practical example where such a pure implementation substitution isn't a response to just nonfunctional requirements at the OOD/P level. If the substitution is due to functional requirements at the OOA level the semantics of the behaviors will be different.]



[BTW, this problem exists in no small part because the OOPLs do not
separate message and method because they are all implemented with type
systems and they all employ procedural message passing.  If one only
needed to define the messages that a [Prey] would accept, most of the
LSP problems evaporate.


I don't think that the problems evaporate just because you define
messages instead of methods. Even in a message scenario "run" is simply
the wrong message. Sending an attackImminent message would much better
convey what is happening. In a more complex scenario one would probably
more adequately send appropriate sound & vision events, which Prey
subclass objects use to establish that an attack is imminent. As
mentioned above, Predator should probably not interact with Prey
directly.
I also think that this has only little to do with procedural vs.
message-oriented. In a procedural language there's "only" more work to
do. I.e. you have to implement a message-oriented system yourself or use
a library.

I think they evaporate because if one truly separates message and response, then the message is just an announcement of something the sender did. That means the sender implementation does not need to know about the response, much less depend in any way on what the responder does, so LSP becomes academic. Where LSP does come into the picture is when the developer decides who cares about the announcement message and should respond to it. At that point the developer must make sure the responder does something appropriate in the overall solution context.
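
A rough Python sketch of that separation, assuming a simple in-process dispatcher stands in for the interface. Dispatcher, connect, announce, pounce, on_attack, and the observed list are all hypothetical names; no particular library or action language is implied.

```python
from collections import defaultdict
from typing import Callable


class Dispatcher:
    """Maps announcement events to responses; stands in for 'the interface'."""

    def __init__(self) -> None:
        self._responders: dict[str, list[Callable[[], None]]] = defaultdict(list)

    def connect(self, event: str, responder: Callable[[], None]) -> None:
        self._responders[event].append(responder)

    def announce(self, event: str) -> None:
        for responder in self._responders[event]:
            responder()  # nothing is returned to the sender


class Predator:
    def __init__(self, dispatcher: Dispatcher) -> None:
        self._dispatcher = dispatcher

    def pounce(self) -> None:
        # The predator only announces what it did; it has no idea who responds.
        self._dispatcher.announce("attack")


observed: list[str] = []  # records responses for the demonstration only


class Gazelle:
    def on_attack(self) -> None:
        observed.append("ran")


class Brontosaurus:
    def on_attack(self) -> None:
        observed.append("stomped")


# The developer, not the Predator, decides who cares about the announcement.
dispatcher = Dispatcher()
dispatcher.connect("attack", Gazelle().on_attack)
dispatcher.connect("attack", Brontosaurus().on_attack)
Predator(dispatcher).pounce()
print(observed)
```

Because announce returns nothing, Predator cannot come to depend on what any responder does; the correctness question moves to the connect calls.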


<aside>
BTW, there is an old R-T/E trick for managing interacting state machines from a DbC perspective. One designs the state machines and their actions independently. Each state will have some precondition that must prevail before it can be executed. Similarly, each state machine action has a postcondition that prevails when it completes. That allows the developer to match preconditions to postconditions to determine where to issue an event. (It's more complicated because of state variables, but that's the general idea.)


However, the same general idea applies to any OO application, even without state machines. Because methods are intrinsic, self-contained responsibilities, one could write all the object methods without including any calls to other methods. Then one could apply the DbC precondition-to-postcondition matching to determine where to make the calls. That would guarantee implementation independence of the methods.

Now in practice an experienced developer only needs to do that for complicated situations with complex synchronization rules. Usually the developer will know that a Predator is going to attack a Prey and that narrows down the condition matching substantially. B-)
</aside>
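
The condition matching in the aside can be sketched mechanically. This Python fragment assumes conditions can be written as plain named strings and ignores state variables entirely; Action, match_conditions, and the three example actions are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    precondition: frozenset[str]   # conditions that must prevail before it runs
    postcondition: frozenset[str]  # conditions that prevail when it completes


def match_conditions(actions: list[Action]) -> list[tuple[str, str]]:
    """Pair each action with the actions whose preconditions it establishes."""
    return [
        (producer.name, consumer.name)
        for producer in actions
        for consumer in actions
        if producer is not consumer
        and consumer.precondition  # actions with no precondition need no trigger
        and consumer.precondition <= producer.postcondition
    ]


actions = [
    Action("Predator.attack", frozenset(), frozenset({"attack launched"})),
    Action("Gazelle.run", frozenset({"attack launched"}), frozenset({"prey moving"})),
    Action("Predator.chase", frozenset({"prey moving"}), frozenset()),
]
print(match_conditions(actions))
```

Each resulting pair marks a place where the first action could issue the event (or make the call) that triggers the second.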



That's because separation of message and method decouples the contexts
so there is no expectation of specific behavior on the part of the
client.  Thus there is no contract with the client per se.  The message
simply becomes an announcement of something the client did.  The
developer could then map that announcement to the proper response in
the interface.  That does involve contract enforcement but it is at a
quite different level of abstraction (e.g., a UML Interaction
Diagram).]


Right, but that's just an artifact of the rather special relationship
Predator <--> Prey. Procedural interaction is usually just fine between
everyday objects like FileStream and Whatever, because Whatever can (and
must be able to) rely on the documented behavior of FileStream. Predator
and Prey are much more loosely coupled.

I could argue that FileStream is a computing space artifact and it is unlikely to change significantly because it is really mathematically defined. Customer problem spaces aren't though, which is what makes constructing software interesting. That is, the OO paradigm is focused on properly abstracting customer problem spaces in a manner that is maintainable in the face of volatile requirements. One technique for doing that is implementation independence, which is what drives the conceptual separation of message and method. One only gets into trouble at the OOP level where they aren't separated. Then it becomes much easier to make caller implementations dependent on callee responses because of expectations arising from encoding procedural-style Do This calls.



To fix this problem we would need to do further surgery on the tree:

                  [Base]
                  + foo()
                    A
                    |
          +---------+--------+
          |                  |
        [SubA]            [Sub3]
          A               + foo()
          |
    +-----+------+
    |            |
  [Sub1]       [Sub2]
  + foo()      + foo()

Not only do we have to perform more surgery on the tree, we have to
change the calling context in Client to invoke SubA.foo() rather
than Base.foo().  Thus the Client has to understand the tree to
invoke the correct property implementations.



How could this ever fix the problem above? Sub3.foo() is buggy, no amount of rearranging the inheritance tree will fix that.


It fixes the problem because the Client can unambiguously access the
tree at the [SubA] level so it never gets the unacceptable [Sub3]
behavior.


Right, that fixes the problem by effectively not calling the buggy code.
I'm not sure whether that's very helpful in practice because someone
somewhere implemented Sub3 for a reason.

The OOP implementation can then ensure the relationship

But it shouldn't be "buggy" code if that is the way the problem space taxonomy works. The fact is that different subsets of Base objects have three different behaviors (four if Base is not abstract) and not all of those behaviors may be appropriate for particular clients. Going back to the Prey example, consider:


             [Prey]
                A
                |
      +---------+---------------+
      |                         |
   [Bird]                   [Antelope]
   + takeFlight()           + run()
                                A
                                |
                      +---------+-------+---...
                      |                 |
                  [Gazelle]         [Impala]

Now let's say I have two flavors of [Predator], [Hawk] and [Lion] and assume lions can't catch birds that take flight. If the goal of the collaborations is for the predator to satisfy its hunger, then hawks are not going to attack antelopes and lions are not going to attack birds. (It's just an example so let's not pick on the exceptions.) In that case the behavior of takeFlight for [Lion] and run for [Hawk] are inappropriate and could lead to LSP problems. So neither predator should have a direct relationship with [Prey]; instead their relationships should be with [Bird] or [Antelope].

Bottom line: the fact that certain clients should not access certain members of the taxonomy does not invalidate the taxonomy. One models the problem space for what it is and then manages the collaborations to be consistent.
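
A small Python sketch of managing those collaborations, assuming each predator's relationship is expressed as the parameter type of a hypothetical attack method. Python does not enforce annotations, so the isinstance checks stand in for what a statically typed OOPL would reject at compile time.

```python
class Prey:
    pass


class Bird(Prey):
    def take_flight(self) -> str:
        return "took flight"


class Antelope(Prey):
    def run(self) -> str:
        return "ran"


class Gazelle(Antelope):
    pass


class Impala(Antelope):
    pass


class Lion:
    """Related to [Antelope] only, never to [Prey] in general."""

    def attack(self, prey: Antelope) -> str:
        # Python does not enforce annotations, so the relationship is checked here.
        if not isinstance(prey, Antelope):
            raise TypeError("a Lion's prey must be an Antelope")
        return prey.run()


class Hawk:
    """Related to [Bird] only."""

    def attack(self, prey: Bird) -> str:
        if not isinstance(prey, Bird):
            raise TypeError("a Hawk's prey must be a Bird")
        return prey.take_flight()


print(Lion().attack(Gazelle()))
print(Hawk().attack(Bird()))
```

The [Prey] taxonomy is left exactly as the problem space defines it; only the access points of the two clients are constrained.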


is instantiated only with [SubA] members of [Base].  The downside is
that the Client must understand where to access the tree so if the
tree changes the Client may become broken.


Agreed, but let's be clear that this is not because something is wrong
with subclassing non-abstract classes or overriding methods per se.

I never said there was anything wrong with subclassing; it is one of the most important tools in the OO arsenal. What I do contend is that implementation inheritance in general and implementation overrides in particular are not good OO practice (in most situations). That's because they open the door to maintainability problems later.


So what's a language that's not poorly formed?

Languages do not define OOA/D methodologies or developer discipline.
Quite often they make compromises with practicality because they are
3GLs (e.g., no OOPL fully separates message and method).  I suspect
that designers leave in features like implementation overrides and
dynamic_cast because it is easier than providing an explicit way to
deal with very specific situations like external libraries and
external object streams.  But that doesn't mean the feature should be
used routinely.  (Note that you indicated above that you make your
base classes abstract except for libraries, so you aren't using that
feature even though it is there.)


Note that I said that base classes should *usually* be abstract and that
*most* of my base classes are abstract. I do implement non-abstract base
classes from time to time and when I do then I'm usually rather happy
that the language designer did not try to force on me some heuristic
that happens to be wrong in my specific case.

OK, you cited one case where I agreed that it can be useful. There may be a couple of others (though I can't think of any offhand). My assertion is that one should always look for an alternative and should resort to such overrides only when there isn't a reasonable alternative.



Also note that the OOPLs still do a horrendous job of managing
physical coupling even after three decades.  Entire books are devoted
to refactoring techniques just to solve the developer's problem of
having more maintainable OOPL code.  The amount of time developers
have to waste on dependency management at the OOP level boggles the
mind.  If they haven't addressed that problem yet, it shouldn't be
surprising that they haven't addressed the override problem.


I can't really comment on that because I don't see an awful lot of room
for improvement on the language level (but I agree that there is some
room). I think there's a lot more room on the architecture level. If
only architects put more thought in their designs (and refactored them
as soon as they see problems during evolution) then we wouldn't have the
maintenance nightmares we're currently observing on some projects.

Some things like defining the logical structure of a class separately from the physical structure (as opposed to in the same header file) would help a lot. But OOPLs are crippled by being 3GLs and are necessarily bound closely to hardware computational models. The languages used for translation (UML plus an abstract action language) don't have those problems at all. But they are so abstract that they are also independent of particular computing environments so they are already 4GLs.



************* There is nothing wrong with me that could not be cured by a capful of Drano.

H. S. Lahman
hsl@xxxxxxxxxxxxxxxxx
Pathfinder Solutions  -- Put MDA to Work
http://www.pathfindermda.com
blog: http://pathfinderpeople.blogs.com/hslahman
(888)OOA-PATH


