Re: object system...




"H. S. Lahman" <h.lahman@xxxxxxxxxxx> wrote in message
news:tYs9l.4369$BC4.1706@xxxxxxxxxxxxxxxxxxxxxxx
Responding to cr88192...

If yes, then you are wrong, because there is not even one OO language
with perfect performance.
There is no 3GL that provides perfect performance;
for that you need machine language.
I disagree. General-purpose languages of xGL are not designed to
decrease the performance of the previous (x-1)GL.
It's true that they aren't *designed* to be slower. But the price of
abstraction is performance. Any nontrivial C program will be 30-50%
slower than hand-tuned Assembly. Any C++ program will be 50-200% slower
than the C program.


not always the case.
C is usually fairly comparable to "mundane" assembly...

I'm afraid that is just not true. See my response to Riemenschneider. C
isn't even as fast as other systems programming languages.


yes, but then again, one can debate then what is "mundane" assembly...

some of us tend to write assembly code that looks more than a little like
what the C compiler typically produces...

so, some of us begin to think in assembler, and about the first thing that
comes out on the screen is:

_foo:
push ebp
mov ebp, esp
sub esp, 48
mov [ebp-4], edi    ; save callee-saved registers in the frame
mov [ebp-8], esi

....

mov edi, [ebp-4]    ; restore them
mov esi, [ebp-8]
mov esp, ebp        ; release the frame
pop ebp
ret

so, yes, that is coming off to a good start...

and so, yes, one's assembler code typically does perform similarly to one's
C code in this case, and this is what I meant by "mundane" assembly (as
opposed to hand-optimized assembly, where one will typically trim down the
prologue and epilogue, refrain from using ebp as a frame pointer, try to keep
variables in registers, ...).


much of the time, C++ is performance competitive with C (assuming of
course that the C and C++ code operate similarly, for example, the C++
code looks like C, or the C code uses an object-based approach).

The only conceivable situation where this would be true is when one is
using C++ as a "Better C". In that situation one essentially has a C
program with strong typing and is not using any OO constructs more
complicated than a simple Class.


yes, this is actually what I meant...

in the case where one uses C++ almost exactly the same as C, the compiler
output is usually very similar...

and, by the time the C code is using lots of heap-allocated memory, tables full
of function pointers, lots of wrapper functions and trampolines, ... then it
performs much more like C++ when using lots of OO features...


C++ is a versatile language, as one can use it in some places mostly like it
were C, and in other places, mostly like it were Java (I have seen C++
projects where probably the majority of the codebase is contained in the
headers, and long and elaborate '#include' chains are used to hold the whole
thing together...).

now, which case is faster?...


Note that the commercial full code generators for OOA models routinely
target C or Assembly instead of an OOPL in situations where performance is
important. They do so even though it would be easier to produce OOPL code
from an OOA model.


yeah...


there are many other cases where "abstraction" can be leveraged to the
advantage of the compiler writer, allowing the production of more
efficient code than would have been produced otherwise.

a simple example of this would be compiler provided complexes or vectors,
where a compiler may be able to leverage the capabilities of the
processor (such as specific opcodes for vector handling, ...), and thus
generate faster code than would happen if the coder were to just write a
big mass of scalar code.

The very best an optimizing compiler could possibly do is write code that
is exactly as fast as that produced by a competent Assembly developer. But
the compiler can't understand the specific problem being solved. So there
will always be some problem solutions where the Assembly programmer can beat
the optimizer by tailoring the Assembly to that specific problem. Been
there; done that.

The only reason those cycle-counting Assembly gurus are a vanishing breed
is that it takes them too long to do it. So the economics drives putting
money into optimizing compilers that *usually* do *almost* as well as the
Assembly guru.


yeah...

but, alas, this is back to the issue of said guru consistently applying
optimization, which is slow and requires lots of thinking.

the compiler does it faster, and a coder when faced with a larger problem
will usually resort back to slow-ass approaches (and/or huge amounts of
copy/paste).

the usual approach then is to do most of the codebase in a language like C
or C++, and only fall back to ASM when needed for performance reasons (so,
we use ASM for some tight vector-math code, but not for things like the
windows event callback, ...).


but, the point was, for example, in C is it more efficient to make use of
the compiler-provided complexes (such as the "float _Complex" type), or
hand-write one's own complexes using scalars, function calls, and
structs?...

in general, the compiler-provided abstraction will somewhat beat out the
performance of doing it manually.

for example:
c=a*b;

may generate a chunk of specialized x87 or SSE instructions, but:
c=myapiMulComplex(a, b);

involves the function call overhead, the cost of copying the structs around
on the stack, much less efficiently crafted ASM output (since the compiler
may lack most of the context from the caller), ...

this is especially so in the case where the processor provides specific
facilities that help with an operation (such as SSE), that can be exploited
easily by built-in abstractions, but would have been missed had the dev
made use of lots of scalar code instead (most compilers prove incompetent at
making effective use of SSE for optimization tasks apart from explicit use
of special vector types and similar...).
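As a rough sketch of the two approaches being compared (the struct type and the "myapi" function are hypothetical stand-ins, and "builtinMulComplex" is only a wrapper added here so both forms can be called the same way), the manual and built-in versions might look like this in C99. Whether the built-in form actually wins depends on the compiler, the flags, and whether the manual function gets inlined, so this is illustrative rather than a benchmark:

```c
#include <complex.h>

/* Hypothetical hand-rolled complex type (the "myapi" names are illustrative). */
typedef struct { float re, im; } myapiComplex;

/* Manual version: structs go in and out by value through a function call. */
myapiComplex myapiMulComplex(myapiComplex a, myapiComplex b)
{
    myapiComplex c;
    c.re = a.re * b.re - a.im * b.im;
    c.im = a.re * b.im + a.im * b.re;
    return c;
}

/* Built-in version: the compiler sees the whole operation as one expression
   on a type it knows about, so it is free to pick suitable instructions. */
float _Complex builtinMulComplex(float _Complex a, float _Complex b)
{
    return a * b;
}
```

Both compute the same product; the difference is only in how much the compiler knows about the operation at the point where it is used.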


so, yes, not all abstraction is bad, or kills performance, but rather only
those features which necessarily hurt performance in order to make use of
them (say, if, rather than adding some specialized functions, the user
considers "well, maybe I will use more generic functions and a whole lot of
conditionals based on dynamic_cast?...").



or a compiler for a GC'ed framework may notice that an object never
survives past the end of a function call, and may thus use a more
efficient means for allocating the object (such as stack-based memory),
whereas a compiler for a language with manual memory management may fail
to take this into account (so, for example, not only do we have a
new/delete pair, but the object also goes on the main heap, ...).

LOL. If you are using a GC language, performance clearly isn't a high
priority.


maybe true...
none the less, the addition of GC does not "always" make the thing slower
(such as when the compiler may notice that it doesn't have to allocate
anything at all...).

granted, yes, the C coder would probably notice this case as well, and put
the data in question on the stack...

but, we also have many people who use 'new' and 'delete' to allocate small
non-passed arrays which don't survive past the end of the function...
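To make the case concrete, here is a minimal C sketch (function names and the SCRATCH_MAX bound are made up for illustration) of the same small scratch array done both ways. The heap version pays for an allocation and a free on every call; the stack version is what a careful coder, or a compiler able to prove the array never escapes, would use instead:

```c
#include <stdlib.h>

#define SCRATCH_MAX 64   /* assumed upper bound for the stack version */

/* Heap version: allocates and frees a scratch array on every call. */
long sumSquaresHeap(int n)
{
    long total = 0;
    int *tmp;
    int i;

    if (n <= 0) return 0;
    tmp = malloc((size_t)n * sizeof *tmp);
    if (!tmp) return -1;                 /* allocation failed */
    for (i = 0; i < n; i++) tmp[i] = i * i;
    for (i = 0; i < n; i++) total += tmp[i];
    free(tmp);
    return total;
}

/* Stack version: the array never outlives the call, so it can simply
   live in the function's frame. */
long sumSquaresStack(int n)
{
    int tmp[SCRATCH_MAX];
    long total = 0;
    int i;

    if (n <= 0) return 0;
    if (n > SCRATCH_MAX) return -1;      /* too big for the fixed buffer */
    for (i = 0; i < n; i++) tmp[i] = i * i;
    for (i = 0; i < n; i++) total += tmp[i];
    return total;
}
```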


We use OOPLs for the abstraction and
logical decoupling they provide, not performance.
I disagree. We use OO design to remove all of the operations that can
be done automatically by the compiler, without any performance loss, not
in order to use abstraction as an end in itself.
Say, what?!? Abstraction is used to boost developer productivity.
Logical decoupling is used to boost developer productivity during
maintenance. Both are indirectly used to improve reliability.
Performance is the price one pays for the productivity and reliability
gains.


again, not always.
they are not mutually opposed, but very often, it is possible to increase
both sets of goals at the same time...

Not IME.

In the early '60s a 1 MLOC application was regarded as gigantic and I
recall an estimate that it would require 1000 engineers working for a
decade to complete such an application. And defect rates in those days
were in the area of 150 defects/KLOC.

We've come a long way since then so that MLOC applications are
small-medium size, they are developed by a group of 8-20 engineers in a
year, and defect rates are down in the 0.5 defects/KLOC range. But you
need a desktop with the computing power of a 1984 mainframe to run
exactly the same spreadsheet that ran in 1984 on a TRS-80 or an Apple I.

We just don't notice the performance hits paid for greater productivity
because Moore's Law has been working for hardware for nearly half a
century.


of course, this issue is not so much about the languages themselves, or
the level of abstraction, but rather about how the apps are
implemented...

the modern spreadsheet also operates in a GUI environment, and typically
drags along a huge mass of system-related libraries, not all of which are
particularly efficient...

Sure. But the point is most of that stuff is focused on making life easier
for developers. Interoperability infrastructures provide silliness like
remote object access because it saves developers a few keystrokes.

I don't care what the application is or what its environment is like. It
will be faster and have a smaller footprint if it is written in Assembly
than if it is written in a procedural 3GL and it will be a lot faster and
a lot smaller than if it were written in an OOPL so long as the
environment is the same. But the OOPL code would also take less time to
develop and it would be more reliable. The market has passed judgment that
time and reliability are more important, in part because the hardware
improvements compensate for the speed and footprint.


yes, this is reasonable...


What? Why? Incredibly. Any void concrete implementation is the best of
all others, but unfortunately nearly always unreachable.
Note that an object is defined by its responsibilities. By definition
those responsibilities are obligations to know or do something. The
interface represents access to those obligations. So the object should
never have an interface to access not doing something.
"The interface represents access to those obligations" - I disagree.
It is possible, we argue about terms only.
OK, let's try this from the Dictionary of Object Technology:

"Interface n. 1. (a) any specification of the boundary of something in
terms of possible interactions or properties that are visible across
that boundary. (b) the visible, outside, user view of something. 2. the
messages to which an object can respond."

That sounds a whole lot like access to me. Also note that in UML an
Interface is a separate model element from a Class. That Interface
element provides an explicit mapping between incoming messages and the
operations or attributes of the Class.


in my case, I consider it more in this latter sense, or more specific, as
in the sense of "interface" as used in languages like Java or C#...

I figured that out when I realized you were a type maven. B-)


?...


3GL type systems are a compromise and one has to be careful about letting
them drive design -- such as believing the interface defines what an
object *is*.


yes, this much is defined by the class and the in-memory layout...

(ok, yes, the above is not serious, but I think I get the idea though...).


"Programming based on interfaces" is a key of OO design. The interface is
the primary part of an abstraction. The appearance of an abstraction is linked
with its interface, not with its implementation. Speaking about an
abstraction, we are speaking about its interface.
This is kind of Motherhood & Apple Pie. Encapsulation is a critical
notion in the OO paradigm. So collaborations should be defined in terms
of interfaces. The point is...?


this can also be related to modularity...

so, an interface is an abstraction over a single class, and a module
would be an abstraction over a whole collection of classes (an entire
subsystem to be implemented hidden behind a singular externally-defined
API).

this thing frustrates me with many OO devs, who focus all their
attention on micro-abstractions (such as individual classes and
interfaces), but fail to apply similar principles to the larger-scale
project (often producing codebases which are structured as some big
elaborate tree with damn near everything inheriting from everything
else).

Not me. If you look at the "Application Partitioning" category of my blog
it will be clear that I regard large scale modularity as critical for
large systems and I believe large scale reuse is much more valuable than
traditional object reuse.

Also, if you have followed my forum posts I am on record as advocating
that one use generalization sparingly, keeping generalizations as simple
as possible, and avoiding complexities like implementation inheritance.


yes, ok.

I am not sure I have looked at any of this...




now, some of us code mostly in C, and have learned to see this issue, if
anything, because if one does not maintain strict modularity and
abstractions, things can quickly start to turn horrible... and, so by the
time one reaches the level of working on 500+ kloc projects, they have
long since learned the need for modular approaches...

this does not mean that this issue goes away in OO languages, only that
OO is "flexible" enough to delay the hard consequences a bit longer,
causing many devs to proceed to form, once again, huge-scale tangled
messes of code (or, for that matter, use the features to conveniently
create constructions which are just distasteful...).

and, then one can see the consequences of this when one proposes...
breaking the app itself into multiple pieces...

OK, but I still don't see how this relates to "programming based on
interfaces". The OO paradigm provides lots of tools for making
applications more maintainable; militant use of interfaces is just one of
them. In addition, APIs have been around since long before OO. So if I was
looking for a thumbnail description of OO design, I would probably go with
something more unique and important to OO, like "programming based on
problem space abstraction."


yes, ok.


So "interface" defines all logical properties of its abstraction (what
to do): responsibilities, access, protocols (sequences of usage) etc.

The division between interface ("what to do") and implementation
("how to do") is artificial, not natural, but anyway an
"object" can never "define its responsibilities", because an "object" is
an instance of an interface.
Ah. I see. You are a type maven. B-) So we need a little primer here on
OOA/D.

Types do not even exist in OOA/D (other than knowledge ADTs). OOA/D is
based on class systems rather than type systems. The 3GL type systems
are a compromise with the hardware computational models. As a result
they do not reflect a number of fundamental OO design issues.


and, some of us really value conventional value types as well...

???


this seemed like you were advocating dispensing with any of the built in
value types, and instead implementing "everything" as a class instance
(implying then, for example, that numerical types are pass-by-reference,
have methods, ... and that operations produce new instances of the class).

usually, this quickly leads to a "slippery slope" of horrible performance
(and the garbage collector going berserk...).

even then, as I see it, even if the performance impact is avoided (value
types are implemented normally, but just fake having a class), I don't feel
this to be an ideal design (our built-in value types just don't need huge
masses of methods).

I am much more inclined to think of applying a generic manipulation to a
type, and not of having operations be a method of the type.

so, I would much rather type:
sin(PI/2);
than:
(PI/2).sin();

even if it were possibly the case that the above were, in some cases,
implemented as a piece of syntactic sugar, such as no generic 'sin' function
existing, but the language having a trick to allow methods to be called on
an object following the same syntax as an ordinary function call.

however, such a trick could lead to potential semantic issues (such as, what
if 2 'sin' functions exist, with one inside the object, but the other in the
caller's scope...).

class Foo
{
Foo foo_add(Foo this, Foo that);
}

'this' as a method argument implying that the method be called as if it were
a function accepting itself...

or whatever...


The most important such issue in this thread context is the separation
of message and method. Because 3GLs all employ procedural message
passing there is no distinction; the message is simply the procedure
signature. Since we name procedures by what they do, the message becomes
a very specific imperative to do something (Do This). Unfortunately,
that implies that the sender of the message knows (A) that the receiver
exists, (B) that the receiver does something specific, and (C) that
doing it is the next thing to do in the solution. All of those things
trash OO encapsulation and implementation hiding because it makes it
easy to construct the sender so that it depends on what the receiver
does.


yes, this is typically the case, but not always the case...

for example, it is possible to implement a declarative API in terms of
linear calls, where each call serves to inform the callee of the next
element in the structure, rather than telling it what to do about it.
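A small C sketch of what such a declarative, call-based API could look like (the Shape type and function names are invented for this example, not taken from any real library). Each call only tells the callee what the next element of the structure is; nothing says what should be done with it:

```c
#define MAX_POINTS 32

typedef struct { float x, y; } Point;

typedef struct {
    Point pts[MAX_POINTS];
    int   count;
    int   open;      /* nonzero between Shape_Begin and Shape_End */
} Shape;

/* Start describing a shape. */
void Shape_Begin(Shape *s)
{
    s->count = 0;
    s->open = 1;
}

/* Describe the next point; returns 0 on success, -1 if the description
   is not open or the fixed-size buffer is full. */
int Shape_Point(Shape *s, float x, float y)
{
    if (!s->open || s->count >= MAX_POINTS) return -1;
    s->pts[s->count].x = x;
    s->pts[s->count].y = y;
    s->count++;
    return 0;
}

/* Finish the description; what happens with it is up to the callee. */
void Shape_End(Shape *s)
{
    s->open = 0;
}
```

The caller still makes "calls" in sequence, but the sequence is a description of data rather than a list of commands.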

The use of "call" here is very telling. One of the problems with 3GL type
systems is that they are married to procedural message passing. That
invites one into a procedural Do This view of the solution as a sequence
of operations to be decomposed. That leads to the hierarchical
dependencies of the legendary Spaghetti Code. The OO paradigm strives to
eliminate those hierarchical dependencies. But because of the 3GL type
system compromises the way one must do that is with a different mindset
towards design, not OOPL programming.

So it doesn't matter how the interface is defined. The critical mindset
for OO development is that an interface only defines messages, not
operations. Once one has that mindset, one can construct objects without
worrying about sequences of operations and messages can be announcements
(I'm Done) rather than procedural imperatives (Do This). That allows one
to connect the dots for flow of control later by routing messages.


oh, ok...

yes, I think I know what you are getting at now, only that this particular
style is often difficult in C without making dependency issues worse
(either the creation of bi-directionally dependent APIs, the ugly use of
lots of callbacks, ...).

a solution I had found before is that each side can provide a function
table, which can then be filled with functions from the other side, allowing
the code to be "soft-linked" (either one party or another, or a 3rd party,
is then responsible for fetching the tables and filling them with the
appropriate functions, in order to set up communication between them).

something like this could also be done, but much more cleanly, in an OO
language.

(actually, I have several subsystems that are linked together via interfaces
like this).
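A minimal sketch of this soft-linking idea in C, assuming two made-up subsystems "A" and "B" (all names here are hypothetical; in real code each subsystem would sit in its own source file, and they are shown together only to keep the example self-contained). Each side exposes a table of the functions it needs, and a third piece of code fills the tables in:

```c
#include <stddef.h>

/* What subsystem A needs from the outside; it does not know who provides it. */
typedef struct { int (*store)(int value); } AImports;

/* What subsystem B needs from the outside. */
typedef struct { int (*notify)(int code); } BImports;

/* ---- subsystem A ---- */
static AImports a_imports;
static int a_last_notice;

AImports *A_GetImports(void) { return &a_imports; }
int A_Notify(int code) { a_last_notice = code; return 0; }
int A_LastNotice(void) { return a_last_notice; }

int A_Run(int value)
{
    if (!a_imports.store) return -1;     /* not linked yet */
    return a_imports.store(value);
}

/* ---- subsystem B ---- */
static BImports b_imports;
static int b_stored;

BImports *B_GetImports(void) { return &b_imports; }

int B_Store(int value)
{
    b_stored = value;
    if (b_imports.notify) b_imports.notify(1);   /* announce the change */
    return b_stored;
}

/* ---- third party: the only code that knows about both sides ---- */
void LinkSubsystems(void)
{
    A_GetImports()->store  = B_Store;
    B_GetImports()->notify = A_Notify;
}
```

Neither A nor B names the other directly; only LinkSubsystems depends on both, so either side could be swapped out by filling the table with different functions.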


a similar approach is regularly used in my case for tasks like "service
registration", where there is a general-purpose API (such as a virtual
filesystem, typesystem, ...), and each "service provider" interfaces with
the API to get a method table with its name on it, and then proceeds to fill
in the table, and possibly another step to give it back.

all of this can be done while creating minimal dependencies (for example,
the "type" which is registering itself need not know of the details of the
internals of the typesystem machinery, and the typesystem machinery need not
know anything about the type being registered).
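One possible shape for this kind of registration, sketched in C with invented names (Type_GetMethods, TypeMethods, and the "vec3" provider are assumptions made for this example, not the actual API being described):

```c
#include <stddef.h>
#include <string.h>

#define MAX_TYPES 16

typedef struct {
    const char *name;          /* must outlive the table; a literal here */
    int (*size_of)(void);      /* slot filled in by the provider */
} TypeMethods;

static TypeMethods type_table[MAX_TYPES];
static int type_count;

static TypeMethods *find_type(const char *name)
{
    int i;
    for (i = 0; i < type_count; i++)
        if (strcmp(type_table[i].name, name) == 0)
            return &type_table[i];
    return NULL;
}

/* Hand out the method table "with this name on it", creating it if needed.
   Returns NULL when the fixed-size registry is full. */
TypeMethods *Type_GetMethods(const char *name)
{
    TypeMethods *m = find_type(name);
    if (m) return m;
    if (type_count >= MAX_TYPES) return NULL;
    m = &type_table[type_count++];
    m->name = name;
    m->size_of = NULL;
    return m;
}

/* Generic query: works for any registered type, knows none of them. */
int Type_SizeOf(const char *name)
{
    TypeMethods *m = find_type(name);
    return (m && m->size_of) ? m->size_of() : -1;
}

/* ---- a provider: knows the table layout, not the registry internals ---- */
static int vec3_size(void) { return 3 * (int)sizeof(float); }

void Vec3_Register(void)
{
    TypeMethods *m = Type_GetMethods("vec3");
    if (m) m->size_of = vec3_size;
}
```

The registry never mentions "vec3", and the provider never touches the table array, which is the minimal-dependency property described above.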

I also use things like this sometimes inside my parsers or compiler
machinery (for example, for parsing specific constructs, generating code
fragments for particular constructions, ...), ...


For example, in the OOA/D one can define every object and every
responsibility fully before defining a single message or collaboration.
One can then go through the behaviors and connect them with
collaboration messages. (There is even a DbC formalism, albeit rather
tedious, to rigorously ensure that the dots are connected correctly.)
[Experienced OO designers rarely do this because they already have a good
idea about flow of control because of the way the necessary objects and
responsibilities are defined. But it is a standard technique in tricky
situations like distributed processing.]


ok.


IOW, one does not think about sequences of operations (calls) until the
very last thing one does in an OO design. Nor does one think in terms of
one object doing something for another object. One designs object methods
in complete isolation from the solution context. It is only in the DbC
context of connecting the flow of control dots that one thinks about
contracts in terms of 'client' and 'service'. But then the 'client' is
quite generic; it is *some* object that accesses the service, not a
particular object.


well, this is different...


[For example, in a well-formed OO application only knowledge attribute
getters should return a value. If a behavior method returns a value that
creates an instant dependency between the caller and implementation of
the receiver. That is, the caller depends on *how* the receiver
computed the value because the value must be correct for the caller's
own specification of what it must do. Note that the 3GL type systems
can't enforce that distinction because the type distinction between
function vs. procedure is orthogonal to the semantics of knowledge vs.
behavior responsibilities.]


yes, of course, we can often return handles as well...

It turns out that the abstract action languages used for OOA/D are very
restrictive about that. (In fact, one AAL I know of was so set-oriented
that it didn't even have syntax for accessing an individual object.) The
only time you can do that is in syntactic constructs that describe
relationship navigation within the model. It is also a true handle in the
sense that its properties are not exposed; all one can do is send messages
to it.

Corollary: in the AALs you cannot pass an object handle as part of a
message data packet between objects in the same subsystem. The receiver
must navigate an existing relationship path itself to get to the object.


odd...


The reason is that object references represent the most egregious form of
coupling because the receiver has an unlimited license to do unexpected
things. Another reason is that it is none of the message sender's business
who the receiver collaborates with.


my usual practice here is to severely limit what can be done with a handle
by making it some opaque type (often either a void pointer or an incomplete
struct type, where the void pointer is better for more "generic" situations,
but the incomplete struct is a lot better at complaining about type
conversions if one accidentally uses the wrong handle in the wrong place,
which can be rather useful to have the compiler warn about...).

so, an attempt to directly manipulate or look into such a handle is not so
much of a temptation.
in many of my APIs, it is context dependent whether or not the handle even
points into addressable memory.

an example being that it is actually an integer handle embedded in a fixnum
(this being done, for example, because the compiler does not complain about
implicit integer conversions, but does complain about pointer
conversions...).
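For illustration, the incomplete-struct variant might look like the following in C (Texture and Sound are hypothetical handle types made up for this sketch). Callers only ever see the forward declaration, so they cannot look inside the handle, and passing a "Sound *" where a "Texture *" is expected draws a diagnostic, which a plain "void *" would not:

```c
#include <stdlib.h>

/* ---- public part: callers only see incomplete types ---- */
typedef struct Texture_s Texture;
typedef struct Sound_s   Sound;     /* a second handle type, for contrast */

Texture *Texture_Create(int width, int height);
int      Texture_Width(const Texture *t);
void     Texture_Destroy(Texture *t);

/* ---- private part: the layout is known only to the implementation ---- */
struct Texture_s {
    int width, height;
};

Texture *Texture_Create(int width, int height)
{
    Texture *t = malloc(sizeof *t);
    if (!t) return NULL;
    t->width = width;
    t->height = height;
    return t;
}

int Texture_Width(const Texture *t)
{
    return t->width;
}

void Texture_Destroy(Texture *t)
{
    free(t);
}
```

In a real project the public part would go in a header and the private part in a single source file, so that "t->width" simply does not compile anywhere else.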


Of course the OOPL type systems allow passing object references around
indiscriminately. Why? Because identity is critical to the OO paradigm and
in the context of hardware a memory address is a very efficient mechanism
for managing identity. But just because the 3GL type system lets one do
something doesn't mean it is a good thing to do. So the AALs place
restrictions to prevent foot-shooting that the OOPLs allow.


ok, yes.



Contrast that with OOA/D where message and method are separated. Now a
message is an announcement of something the sender did to change the
state of the solution (I'm Done). All the sender needs to understand is
what it did. So there can be no dependency in its implementation on what
will happen. It is up to the developer to connect the dots of flow of
control so that the message goes to some object that cares about the
state change and can respond appropriately. (In UML that is done at an
entirely different level of abstraction in an Interaction Diagram.) Thus
the separation of message and method is one of the core mechanisms of OO
decoupling.

When message and method are separated, the role of the interface is
different. It exists to provide a mapping between announcement messages
and responses. That's why Interface is a separate model element from the
Class it encapsulates in UML. In OOA/D class systems, the Class defines
the responsibilities that an object has without regard to context; it
describes the intrinsic nature of the object. That nature is mapped to
context through the interface.

But when message and method are coincident as in the 3GL type systems,
there is no distinction between its intrinsic nature and how it is
viewed in the context of the solution. So in 3GL type systems the
interface also defines what the object is.

That's why it is important to understand OOA/D prior to launching into
OOPL coding. If one gets the OOA/D right, then the sender method will
not depend on what the receiver does because it doesn't know anything
about the receiver. Then the fact that the 3GL type systems have trashed
OO decoupling is not a problem.


?...


what is the problem then with seeing the objects in terms of "something
that is"?...
we have objects, we manipulate objects, and we assemble them into larger
structures.

That is pretty much what I mean by mind set. B-)

First, objects are abstractions of problem space entities. The object is
defined by what the underlying entity *is*, as represented by the
responsibilities that it abstracts. But an interface is not an object and
it does not define the object; it is a separate mechanism for mapping
messages to the object's responsibilities. It is only the 3GL type systems
that are confused about this because they are necessarily bound to 3GL
procedural message passing where there is no separation of message and
method.


ok.


Second, objects are not manipulated (unless you were using an imperial
'we' to refer to the developer creating the application). Objects simply
exist and respond to messages. An object's response changes the state of
the application and that change, in turn, is announced as a message. But
the object triggering that message doesn't know what object will respond
or what, if anything, they will do in response. Flow of control is defined
by the developer in the separate activity of routing the message. That
mindset implicit in separation of message and method is absolutely
critical to good OO design.


ok.

as for 'I', 'we', ... all this can get confusing when talking about code, as
where exactly does one draw the line in terms of responsibility between
themselves and the computer, or the line in identity between themselves and
the data they are thinking or talking about?...

but, alas, if this kind of thinking is let run out of control, it leads to
all sorts of existential issues...

I guess on some level, the coder may "fuse" with the code they write, ...

but, oh well, I will probably not go too far, as philosophy is an area I
have since learned not to venture too far into...


Third, objects are not assembled into larger structures; that is much more
of a functional programming concept. There are only three basic levels of
abstraction in an OO application: subsystem, object, and responsibility.
The elements at each of those levels are peers and they collaborate (send
messages to one another) directly as peers. They are logically indivisible
at the containing level of abstraction. The only "structures" one can form
are through relationships (e.g., a GoF Composite pattern).


ok, it may be that I have a good deal of exposure to FP as well, and so
my thinking and coding practices often mix together FP and OO approaches
(along with good old imperative and procedural approaches, ...).

I tend at times to create lots of code that "composes" and "assembles"
things from common smaller components...

a major example of this is all the things one can do with a simple CONS cell
(note, my framework makes heavy use of lists and CONS'es as well, and
CONS'es are the only heap-based objects that take only 8 bytes of memory on
x86, or 16 bytes on x86-64...).
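A bare-bones CONS cell in C, as a sketch only (this uses plain malloc and untyped "void *" payloads; the framework described above presumably has its own allocator and reference representation, which is not shown here). Two pointers per cell is where the 8-byte and 16-byte figures come from, not counting any allocator overhead:

```c
#include <stdlib.h>

typedef struct Cons_s Cons;
struct Cons_s {
    void *car;    /* the element */
    Cons *cdr;    /* the rest of the list, or NULL */
};

/* Build a new cell in front of an existing list; NULL on allocation failure. */
Cons *cons(void *car, Cons *cdr)
{
    Cons *c = malloc(sizeof *c);
    if (!c) return NULL;
    c->car = car;
    c->cdr = cdr;
    return c;
}

/* Number of cells in the list. */
int list_length(const Cons *l)
{
    int n = 0;
    while (l) { n++; l = l->cdr; }
    return n;
}

/* Free the cells (not the elements they point to). */
void list_free(Cons *l)
{
    while (l) {
        Cons *next = l->cdr;
        free(l);
        l = next;
    }
}
```

Lists, trees, association tables and so on can all be composed out of this one two-pointer structure.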


We depend upon problem space abstraction to make logical indivisibility
flexible. Thus the subject matter and level of abstraction of a subsystem
require that all of the objects that implement it be at a consistent level
of abstraction and be cohesive with respect to the subject matter. (That
is, when we abstract a problem space entity we tailor the abstraction to
the subject matter context.) Similarly, object responsibilities are
required to be consistent with the object's level of abstraction and
subject matter.

This enables the sort of abstraction legerdemain that allows a scalar
object attribute in a high level subsystem to expand into dozens of classes
and thousands of objects in a lower level subsystem.


ok.


<aside>
Alas, the 3GL type systems are the root of another problem, physical
coupling (i.e., what the compiler or interpreter must know about the
receiver to write correct machine language for the message sender). The
OOPLs do a good job of logical decoupling but they do a terrible job on
physical decoupling. The result is that to make an OOPL program
maintainable one has to do dependency management refactoring *after* the
problem solution is correct. Such refactoring is completely unnecessary
for OOA/D models. Corollary: doing a rigorous OOA/D model will usually
minimize the amount of dependency management refactoring required during
OOP.
</aside>


this kind of thing can be helped by giving the implementation reflective
ability, or by generating the code to bind things together dynamically...


so, we have an object and a task to perform, and the code for performing
the task looks over the object and spews out a chunk of code for
performing the requested action, which is then invoked whenever the
action in question needs to take place.

and, if the object in question is changed? then the code for performing
the task looks over the object and generates a new piece of code, and
things continue as before.

Shades of Forth! I sure would not want to be the one that has to maintain
such a system if the "task" is dynamic.



errm, yes, certain parts of my system (especially the dynamic compilation
and self-modifying-code parts), have more than a little influence from Forth
and PostScript...

actually, my C compiler is based around a language I call RPNIL, which is
RPN based, and has more than a little in common with PS (it represents the
code being compiled on its way out of the upper-end of the compiler, and
headed down towards the low-level code generator...).

but, alas, a major intended use of the C compiler was to compile code at
runtime, but it is just too slow to really do this effectively, so far more
often I directly generate chunks of assembler and assemble them at runtime,
if anything, because this is much faster...

I once considered a language design I called RPNIL2, which was even more PS
like (would add many more PS features I had left out, and would be Turing
complete, ...), but would involve moving much of the code generation into
the language itself (AKA: a huge mass of compiler machinery written in a
PostScript derived language), and so this was never really implemented as
such... (at the time, the idea was to continue making incremental
improvements and refinements to RPNIL, even if it is starting to get a
little hairy...).


but, yeah, usually I like to keep these parts hidden away, and not exposed
to the public APIs...

ok, I have exposed some of these kinds of things via the 'dyll' prefix, but
in this case, the 'll' part means that it is low-level, and potentially
unstable...


