Re: LSP and subtype



H. S. Lahman wrote:
BTW, I'm not disputing that base classes should *usually* be abstract
(most of mine are). The problem arises with libraries that provide
classes that can be used out-of-the box and as base classes. If you
follow the base-classes-must-be-abstract rule then you end up with
essentially double the number of classes, see below.

External libraries like FileStream are a kind of special situation because they do not modify application state variables or invoke application behaviors directly. IOW, because their results are manifested outside the application, it is highly unlikely that there could be LSP side effects for the examples you provided.

I regard this as the same sort of exception as processing a stream of
arbitrary objects from an external source in C++.  There one is
essentially forced to use dynamic_cast because it is the only game in
town for reflection.  But just because it is there does not justify
using it for routine navigation of the application's own subclassing
trees.
IOW, I agree with you that external libraries can be an exception to
the admonition of avoiding overrides just as object streams are an
exception to the admonition to never use dynamic_cast in C++.

I sort of agree (I wouldn't put it exactly this way, because I use non-abstract base classes far more often than casts).
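
For what it's worth, here is a rough sketch of the distinction being drawn, with Python's isinstance standing in for C++'s dynamic_cast. Shape, Square, Rect and both helper functions are made up for illustration and appear nowhere in the original discussion.

```python
class Shape:
    """Base of the application's own (hypothetical) subclassing tree."""

    def area(self) -> float:
        raise NotImplementedError


class Square(Shape):
    def __init__(self, side: float) -> None:
        self.side = side

    def area(self) -> float:
        return self.side * self.side


class Rect(Shape):
    def __init__(self, width: float, height: float) -> None:
        self.width = width
        self.height = height

    def area(self) -> float:
        return self.width * self.height


def total_area(shapes) -> float:
    # Own tree: plain polymorphic dispatch, no type tests needed.
    return sum(shape.area() for shape in shapes)


def shapes_from_stream(objects) -> list:
    # External stream of arbitrary objects: a runtime type test is the
    # only way to find out what actually arrived.
    return [obj for obj in objects if isinstance(obj, Shape)]
```

The type test lives only at the boundary where foreign objects enter; once they are known to be Shapes, the rest of the code goes through the base class interface.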

That segues to the notion that the client has to understand _the
whole tree_ to access it properly.  For example, suppose we can't
instantiate Base on a standalone basis and we have:

                  [Base]
                  + foo()
                    A
                    |
          +---------+--------+
          |                  |
        [Sub1]            [Sub2]
        + foo()

where [Sub2] gets the Base.foo() implementation.  Suppose a Client
invoking Base.foo() is happy with either implementation so all is
well.


Why would the client ever care about what implementation he gets? All
Sub1.foo() must respect are the invariants, pre- & postconditions
that were established by Base. If Sub1.foo() does so then the client
will never notice a difference. If Sub1.foo() fails to do so then
the LSP is violated and Sub1 is faulty.

As I said, the stipulation is the Client is happy with either implementation, so it can invoke Base.foo().

Um, why is the "happiness" of some *particular* client code important? As long as all foo overrides respect the invariants, pre- & postconditions that Base documents/relies on then *any* imaginable client code will run just fine with *any* subclass object of Base, won't it? IOW, the client just needs to be happy with the documented behavior (again invariants, pre- & postconditions) of Base and never needs to know that there are in fact subclasses and overrides.
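
A minimal sketch of that argument, in Python for brevity. Only the class names come from the diagram above; the contract wording, the integer parameter and the client function are assumptions made up for illustration.

```python
class Base:
    def foo(self, n: int) -> int:
        """Contract (hypothetical): requires n >= 0; returns a value >= n."""
        if n < 0:
            raise ValueError("precondition violated: n must be >= 0")
        return n


class Sub1(Base):
    def foo(self, n: int) -> int:
        # Different implementation, same documented contract.
        if n < 0:
            raise ValueError("precondition violated: n must be >= 0")
        return n + 1


class Sub2(Base):
    pass  # gets the Base.foo() implementation


def client(obj: Base, n: int) -> int:
    # Written against Base's documented contract only; it neither knows
    # nor cares which subclass it was handed.
    result = obj.foo(n)
    assert result >= n, "postcondition of Base.foo() violated"
    return result
```

Here client(Sub1(), 3) and client(Sub2(), 3) return different values, yet both satisfy everything the client was promised.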

Now suppose somebody comes along doing maintenance and decides we
need a third implementation:

                  [Base]
                  + foo()
                    A
                    |
          +---------+--------+--------------+
          |                  |              |
        [Sub1]            [Sub2]          [Sub3]
        + foo()                           + foo()

This may break the original Client if the Sub3.foo() implementation
is unacceptable (i.e., an LSP violation from the Client
perspective) even though the original Client and its context was
not touched.


This is just an example that derived classes must observe the LSP or
else there's a bug in the software. I don't see how this backs your
assertion that one should not be able to instantiate Base classes.

That was pretty much my point. The pitfall from an LSP perspective is that one cannot arbitrarily add new subclass implementations without validating them against the existing client access contexts.

Same disagreement as above, I think you can leave client access contexts out of the discussion.
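
To make that concrete, here is a rough sketch in which the new subclass is validated against Base's documented contract rather than against any particular client. The contract itself, the deliberately broken Sub3.foo() and the two helper functions are assumptions made up for illustration.

```python
class Base:
    def foo(self, n: int) -> int:
        """Contract (hypothetical): requires n >= 0; returns a value >= n."""
        return n


class Sub1(Base):
    def foo(self, n: int) -> int:
        return n + 1  # still >= n, so the contract holds


class Sub2(Base):
    pass  # inherits Base.foo()


class Sub3(Base):
    def foo(self, n: int) -> int:
        return n - 1  # breaks the postcondition: an LSP violation


def honours_base_contract(obj: Base) -> bool:
    # Checks the documented postcondition on a few sample inputs.
    return all(obj.foo(n) >= n for n in range(10))


def violators(classes) -> list:
    # Names of the classes whose instances fail the contract check.
    return [cls.__name__ for cls in classes if not honours_base_contract(cls())]
```

The check flags Sub3 without a single client being looked at, which is the sense in which client access contexts can be left out.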

Apropos of your point above about DbC conditions, there is really no
way to write such conditions in practice to deal with OO inclusion
polymorphism.  That's because one is not really substituting different
implementations of the same semantics; one is substituting different
behavior semantics.  So the post conditions between subclasses will
usually be different.  IOW, adding a new subclass behavior almost
always changes the base class' ORed postcondition specification.

The only way to effectively deal with that is by raising the level of
abstraction of the base class condition specification.  So if we have:

           1        attacks 1
[Predator] ------------------ [Prey]
                                A
                                |
                    +-----------+
                    |           |
                [Gazelle]  [Impala]
                + run()    + run()

we can implement a Prey.run behavior for the base class and things
will Just Work from an LSP perspective.  But now suppose we add:

           1        attacks 1
[Predator] ------------------ [Prey]
                                A
                                |
                    +-----------+----------+
                    |           |          |
                [Gazelle]  [Impala]   [Pheasant]
                + run()    + run()    + takeFlight()

Now Prey.run() doesn't cut it.  We can fix that by abstracting the
postcondition and providing a Prey.flee() behavior.  However, if we
add:
           1        attacks 1
[Predator] ------------------ [Prey]
                                A
                                |
                    +-----------+----------+-----------------+
                    |           |          |                 |
                [Gazelle]  [Impala]   [Pheasant]       [Brontosaurus]
                + run()    + run()    + takeFlight()   + stomp()

we have a new problem.  We can raise the level of abstraction again
and upgrade the [Prey] postcondition to Prey.respondToAttack().
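
As a rough sketch, the final hierarchy might look like the following in Python (Impala omitted for brevity). The method names are transliterated to snake_case; Predator.attack() and the returned strings are assumptions made up for illustration.

```python
from abc import ABC, abstractmethod


class Prey(ABC):
    @abstractmethod
    def respond_to_attack(self) -> str:
        """Postcondition (hypothetical): the prey has reacted in some way."""


class Gazelle(Prey):
    def run(self) -> str:
        return "gazelle runs"

    def respond_to_attack(self) -> str:
        return self.run()


class Pheasant(Prey):
    def take_flight(self) -> str:
        return "pheasant takes flight"

    def respond_to_attack(self) -> str:
        return self.take_flight()


class Brontosaurus(Prey):
    def stomp(self) -> str:
        return "brontosaurus stomps"

    def respond_to_attack(self) -> str:
        return self.stomp()


class Predator:
    def __init__(self, prey: Prey) -> None:
        self.prey = prey  # the 1:1 "attacks" relationship

    def attack(self) -> str:
        # Relies only on the abstract postcondition of Prey.
        return self.prey.respond_to_attack()
```

The postcondition is now so general ("the prey reacted somehow") that all three behaviors satisfy it, at the price of telling the Predator very little.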

This just shows that run() was wrong to begin with. I'm suspicious whether
even respondToAttack is right, because technically a predator
doesn't/shouldn't really tell its prey to respond to its attack. I think
the attack should more adequately be seen as an event to which Prey
subclasses will react. That is, in a real-world design there's usually
one more level of indirection. But I guess that's what you're hinting at
below anyway.


However, the reality is that so long as we keep adding new behaviors
we have to keep upgrading the notion of the [Prey] behavior.  More
important, we have to check every existing client to make sure the new
postcondition is still acceptable.

Ok, I guess I'm starting to see why you keep saying that one needs to check existing clients when one adds new subclasses. I agree that you need to do that in the run --> respondToAttack scenario, because you have effectively changed the behavior/interface of Prey. There's no way around that.

[BTW, this problem exists in no small part because the OOPLs do not
separate message and method because they are all implemented with type
systems and they all employ procedural message passing.  If one only
needed to define the messages that a [Prey] would accept, most of the
LSP problems evaporate.

I don't think that the problems evaporate just because you define messages instead of methods. Even in a message scenario "run" is simply the wrong message. Sending an attackImminent message would much better convey what is happening. In a more complex scenario one would probably more adequately send appropriate sound & vision events, which Prey subclass objects use to establish that an attack is imminent. As mentioned above, Predator should probably not interact with Prey directly. I also think that this has little to do with procedural vs. message-oriented. In a procedural language there's "only" more work to do, i.e. you have to implement a message-oriented system yourself or use a library.
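
A rough sketch of what "implement a message-oriented system yourself" could look like. The EventBus class, the "attack_imminent" event name and the returned strings are all assumptions made up for illustration; a real design would more likely use an existing library.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class EventBus:
    """Tiny hand-rolled publish/subscribe mechanism (hypothetical)."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[], str]]] = defaultdict(list)

    def subscribe(self, event: str, handler: Callable[[], str]) -> None:
        self._handlers[event].append(handler)

    def publish(self, event: str) -> List[str]:
        # Announce the event; each subscriber decides how to react.
        return [handler() for handler in self._handlers[event]]


class Gazelle:
    def run(self) -> str:
        return "gazelle runs"


class Brontosaurus:
    def stomp(self) -> str:
        return "brontosaurus stomps"


class Predator:
    def __init__(self, bus: EventBus) -> None:
        self._bus = bus

    def attack(self) -> List[str]:
        # The predator only announces what it is doing; it never names
        # a prey method and holds no reference to any Prey object.
        return self._bus.publish("attack_imminent")
```

The mapping from announcement to response is done where the handlers are subscribed, so Predator no longer depends on any Prey method at all.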

That's because separation of message and method decouples the contexts
so there is no expectation of specific behavior on the part of the
client.  Thus there is no contract with the client per se.  The message
simply becomes an announcement of something the client did.  The
developer could then map that announcement to the proper response in
the interface.  That does involve contract enforcement but it is at a
quite different level of abstraction (e.g., a UML Interaction
Diagram).]

Right, but that's just an artifact of the rather special relationship Predator <--> Prey. Procedural interaction is usually just fine between everyday objects like FileStream and Whatever, because Whatever can (and must be able to) rely on the documented behavior of FileStream. Predator and Prey are much more loosely coupled.

To fix this problem we would need to do further surgery on the tree:

                  [Base]
                  + foo()
                    A
                    |
          +---------+--------+
          |                  |
        [SubA]            [Sub3]
          A               + foo()
          |
    +-----+------+
    |            |
  [Sub1]       [Sub2]
  + foo()      + foo()

Not only do we have to perform more surgery on the tree, we have to
change the calling context in Client to invoke SubA.foo() rather
than Base.foo().  Thus the Client has to understand the tree to
invoke the correct property implementations.


How could this ever fix the problem above? Sub3.foo() is buggy, no
amount of rearranging the inheritance tree will fix that.

It fixes the problem because the Client can unambiguously access the tree at the [SubA] level so it never gets the unacceptable [Sub3] behavior.

Right, that fixes the problem by effectively not calling the buggy code. I'm not sure whether that's very helpful in practice because someone somewhere implemented Sub3 for a reason.

The OOP implementation can then ensure the relationship
is instantiated only with [SubA] members of [Base].  The downside is
that the Client must understand where to access the tree so if the
tree changes the Client may become broken.
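
A rough sketch of that restriction in Python, where the check has to be done at run time (a statically typed language would express it in the declared type of the reference). The Client class, its use() method and the returned strings are assumptions made up for illustration.

```python
class Base:
    def foo(self) -> str:
        return "base"


class SubA(Base):
    pass  # intermediate level introduced by the "surgery"


class Sub1(SubA):
    def foo(self) -> str:
        return "sub1"


class Sub2(SubA):
    def foo(self) -> str:
        return "sub2"


class Sub3(Base):
    def foo(self) -> str:
        return "sub3"  # the behavior this Client cannot accept


class Client:
    def __init__(self, target: SubA) -> None:
        # The relationship may only be instantiated with SubA members.
        if not isinstance(target, SubA):
            raise TypeError("Client may only be linked to SubA members")
        self._target = target

    def use(self) -> str:
        return self._target.foo()
```

The downside mentioned above is visible in the constructor: Client now names SubA, so any further rearrangement of the tree below Base can break it.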

Agreed, but let's be clear that this is not because something is wrong with subclassing non-abstract classes or overriding methods per se.

Assuming that Base adequately documents its behavior (including
invariants etc.), the client can only become broken because of 2 things:
1. The programmer implementing the client misinterpreted the documented
behavior of Base. As a result, the client could work just fine with
some Base subclass objects but not with others. Obviously, the bug is in
the client.
2. The programmer implementing a new subclass violates the LSP. Even
more obviously the bug is in the new subclass.

Bottom line: overriding implementations in subclassing was just one
of those things that seemed like a good idea at the time but turned
out to be a real Bad Idea.


Reality disagrees with this statement. Most of the real work is done
with languages that allow exactly that (C++, Java, C#, VB, etc.).
BTW, would you care to entertain us with a name of a language that,
in your view, is not poorly formed?

So what's a language that's not poorly formed?

Languages do not define OOA/D methodologies or developer discipline.
Quite often they make compromises with practicality because they are
3GLs (e.g., no OOPL fully separates message and method).  I suspect
that designers leave in features like implementation overrides and
dynamic_cast because it is easier than providing an explicit way to
deal with very specific situations like external libraries and
external object streams.  But that doesn't mean the feature should be
used routinely.  (Note that you indicated above that you make your base
classes abstract except for libraries, so you aren't using that
feature even though it is there.)

Note that I said that base classes should *usually* be abstract and that *most* of my base classes are abstract. I do implement non-abstract base classes from time to time and when I do then I'm usually rather happy that the language designer did not try to force on me some heuristic that happens to be wrong in my specific case.

Also note that the OOPLs still do a horrendous job of managing
physical coupling even after three decades.  Entire books are devoted
to refactoring techniques just to solve the developer's problem of
having more maintainable OOPL code.  The amount of time developers
have to waste on dependency management at the OOP level boggles the
mind. If they haven't addressed that problem yet, it shouldn't be surprising
that they haven't addressed the override problem.

I can't really comment on that because I don't see an awful lot of room for improvement on the language level (but I agree that there is some room). I think there's a lot more room on the architecture level. If only architects put more thought into their designs (and refactored them as soon as they saw problems during evolution) then we wouldn't have the maintenance nightmares we're currently observing on some projects.

--
Andreas Huber

When replying by private email, please remove the words spam and trap
from the address shown in the header.
