Re: Object identity
- From: "David Barrett-Lennard" <davidbl@xxxxxxxxxxxx>
- Date: 29 Jun 2006 02:51:08 -0700
H. S. Lahman wrote:
Responding to Barrett-Lennard...
Clarification 2: It is not an interpretation. It is a rule that objects
must have unique identity. It is also a rule that identity, however
it is represented, must be unambiguously traceable: problem space entity
-> object abstraction -> run-time instance. However, that doesn't mean
that identity must be represented the same way for each; it just has to
be unique for each one and traceable between them.
What is not an interpretation? Note that an interpretation, formally
defined as a mathematical function, is the standard way to deal with
the relationship between model and what is modelled. I've seen it in
texts on mathematical logic, automated theorem proving, relational
modelling etc.
Now we also have different definitions of 'interpretation' to add to the
mix. B-) I don't see the application of a rule as being any kind of
interpretation of what to do. IOW, rules are deterministic while
interpretations are not.
Even using your definition of 'interpretation' in Section 2 to be a
mapping function, I think it would be a stretch to use that in this
context. The rules for how one constructs objects and instances
/enable/ a mapping between problem space entities, objects, and object
instances, but I don't think they are the mapping itself.
You are going to have to be more formal for me to comment further. For
example, I don't know what you mean by "rule".
The dictionary definition will do: a fixed principle that determines
conduct. IOW, not a lot of wiggle room.
---- Section 3: Why is it a misconception?
This is demonstrated with the following code:

#include <string>
using std::string;

class Employee
{
public:
    string GetName() const { return name; }
    void SetName(string newName) { name = newName; }
    float GetSalary() const { return salary; }
    void SetSalary(float newSalary) { salary = newSalary; }
private:
    string name;
    float salary;
};

void foo()
{
    Employee* e = new Employee;
    e->SetName("Albert Einstein");
    e->SetSalary(25000);
    e->SetName("Kurt Godel");
    e->SetSalary(29000);
    delete e;
}
This is a good example of the difference between object and instance.
What you have is two objects but only one instance.
Hmmm. I don't like that terminology at all. I think there is only one
object (pointed to by e). I don't distinguish the object from the
instance at all. I regard these as perfect synonyms (unless the class
instance is a value type, in which case there is no object at all).
Au contraire. The Einstein and Godel objects are quite explicitly
defined and one can inspect those definitions whether the code is
compiled and executed or not. That a member of the Employee set exists
with identity of "Albert Einstein" and salary of 25000 is quite clear.
And it is equally clear a different object of the Employee set exists
with the identity "Kurt Godel" and salary of 29000.
The humans Einstein and Godel are entities, *not* objects. I think it
is entirely non-standard of you to call them objects (in the context of
a discussion about OO, and in particular object identity).
Yes, the humans that the Einstein and Godel objects abstract are problem
space entities. But objects are abstractions that represent them with
properties like Name and Salary. Those objects are defined in your code
example (and the class definition for Employee).
Problem space:
    Conceptual entity: Employee
    Concrete entity: Einstein
    Concrete entity: Godel
OO software space:
    Class: Employee {Name, Salary}
    Object: Einstein {Name = Einstein; Salary = 25000}
    Object: Godel {Name = Godel; Salary = 29000}
And those objects exist conceptually independently from the instance, 'e'.
That is value semantics, not object semantics. The moment you decouple
the content of an instance from its location in memory, you are using
value semantics. A value-type can be copied around as much as you
like, and it always represents the same underlying entity.
Unlike Java or Smalltalk, C++ lets you create your own value types,
such as a Date class. Instances of a Date class are used like int and
float. There are no objects directly associated with instances of
Date. There are only variables that store date values. A date value
is mapped under an interpretation to the "real" notion of date that
exists independently of the computer.
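To make the value-type idea concrete, here is a minimal sketch of a Date type used the way int and float are used. The struct, its fields, and the helper copyDenotesSameDate() are hypothetical names chosen for illustration; they do not come from any library or from the original post.

```cpp
#include <cassert>

// Hypothetical value type: a Date has no identity of its own.
// Two Date variables holding the same fields denote the same calendar date.
struct Date {
    int year;
    int month;
    int day;
};

// Equality compares content, never addresses.
inline bool operator==(const Date& a, const Date& b) {
    return a.year == b.year && a.month == b.month && a.day == b.day;
}

// Copies a date and reports whether the copy denotes the same value,
// even though it lives in a different variable.
inline bool copyDenotesSameDate() {
    Date posted{2006, 6, 29};
    Date copy = posted;          // plain copy, as with int or float
    assert(&copy != &posted);    // two distinct variables in memory
    return copy == posted;       // same value under the interpretation
}
```

Nothing in the sketch compares addresses to decide whether two dates are "the same"; that decision is made purely on content, which is the sense of value semantics used here.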
Your definition of "object" has nothing to do with (pure) OO.
Let's use some consistent terminology and *always* use the word entity
for particular things that exist independently of the computer. Or do
you somehow distinguish between entity and object, even before you
compile and execute the code? If you do please provide a sufficiently
formal definition so I can understand what you mean.
When have I used 'entity' to mean anything except things that exist
independently of the computer? When have I not fully qualified them as
'problem space entity'? In the quoted text I am specifically talking
about /objects/, which are design abstractions that /represent/ problem
space entities. The terminology I am using is standard OO terminology.
You use the standard words but not the standard meaning.
Objects obviously exist outside the execution context because they are
models that the developer provides during design. (In a model-driven
development that will be long before the 3GL code is even written, much
less executed.) That's why I pointed out a couple of messages ago that
there is a difference between 'object' and 'instance'. There are three
levels here:
entity -> object -> instance
which can be paraphrased as:
reality -> abstraction -> execution instantiation
There are only two levels for me : Entity and object. Object is a
synonym for instance.
The trick is that
only one object is instantiated at a time.
Your terminology is inconsistent, in the sense that you say there is
only one instance yet there have been two (object) "instantiations".
The instance, 'e', is identified by its address in memory. Because of
the imperfections of the OOPL in its zeal to provide low level control
over performance optimization, that presents a conundrum because each
instance of the two objects would have the same address identity. The
only way around that is to ensure that only one instance of the two
objects can exist at one time, which the implementation mechanics of
overwriting of memory locations ensures in a simple-minded way without
regard to other issues like relationship management.
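The address-based identity described here can be observed directly. The following is a small sketch, assuming a cut-down Employee with only the name field; the helper addressSurvivesRename() is a hypothetical name added for illustration.

```cpp
#include <cassert>
#include <string>

// Cut-down version of the Employee class from the post (name only).
class Employee {
public:
    std::string GetName() const { return name; }
    void SetName(const std::string& newName) { name = newName; }
private:
    std::string name;
};

// Overwrites the content of one instance and reports whether the final
// content is the second name. The address is checked along the way.
inline bool addressSurvivesRename() {
    Employee e;
    const Employee* before = &e;   // the only identity the language tracks
    e.SetName("Albert Einstein");
    e.SetName("Kurt Godel");       // content replaced in place
    assert(before == &e);          // same instance, same address
    return e.GetName() == "Kurt Godel";
}
```

The language guarantees only that the address is stable for the lifetime of the instance; whether the changed content still "means" the same employee is outside anything the compiler checks.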
IMO you attribute a strong implicit semantic to OO that simply isn't
there in the first place.
Per the above, the terminology I am using is quite standard OO
terminology. I'm afraid it is you who is equating 'object' to the
memory image rather than regarding it as a conceptual abstraction of a
problem space entity.
Yes, we don't agree.
By strong semantic I'm referring to your requirement that there be an
interpretation that is 1-1 (in some scope or sub-system or whatever you
want to call it).
My definition of object is just that : a definition of a word. It
doesn't limit your designs in any way. However your 1-1 interpretation
requirement is a restrictive rule. I claim that such a rule is
harmful. I demonstrated this with actual code. Your defence is that
OOPLs are impure.
I have no problem with an object (ie an instance of a class in memory)
that can have fields changed, making it suddenly represent a different
entity under some interpretation. This is allowed because unlike RM,
OO doesn't itself come ready made with semantics that relates back to
an interpretation. I'm somewhat with Bob Badour when he likens OO to a
methodology for "constructing large unpredictable state machines out of
small predictable state machines". Now I wouldn't go quite that far,
but it does seem clear that the onus is on the programmer to formally
prove that an OO program will solve the problem at hand. Using OO can
be as dangerous as assembly. You have complete, unrestricted access to
a Turing machine. You can express logical, correct solutions as well
as incorrect ones that sound right but require careful analysis to
reveal subtle errors. This promiscuity allows the OO developer to
create wholly new algorithms and techniques. But that power and
generality comes at the price of only low level implicit semantics.
The onus is on someone to prove that /any/ software works correctly. As
it happens, OO software is unique because one can demonstrate
correctness at the design level since well-formed OOA models are
executable in themselves.
You're just talking about design by contract, which is always needed
for imperative programs. Any (well written) C program can also be
proven correct at the design level.
I imagine you think that a class diagram always captures most of the
design. I use OO for systems programming, and in that domain a class
diagram is not so useful. The problems I solve are mostly algorithmic
in nature. For example, I'm currently working on a sub-system to
partition objects into spaces that are independently garbage collected,
and support shared read and exclusive lock modes. I'm definitely using
OO, but the class diagram is simple and not very interesting. Most of
the detail is in the locking and garbage collection algorithms. I find
C++ to be a reasonable (but far from ideal) way to directly express the
algorithms. Separate documentation is required to formally prove
correctness of the algorithms.
Another advantage is that one's access to the hardware computational
models is actually severely restricted in OOA/D. For example, one must
resolve functional requirements in an OOA model without /any/ pollution
from the computing space or else the reviewers will get out the
crucifixes and garlic cloves. (As I think I already pointed out in this
thread, one should be able to implement an OOA model unambiguously in
the customer space as a manual system and one can't do that if the
solution is explicitly twiddling the hardware.) That allows one to
separate the concerns of the computing space from those of the problem
space and focus on each individually better than one can do in any other
approach to software development that I know of.
I don't know what you mean. Eg, what is a "hardware computational
model"?
You can't write a general purpose program that can look at a snapshot
of the run time state of any given running OO system, in order to
deduce truths (in the form of predicates about entities), even if it's
provided with the source code, and is also able to find all the global
variables and navigate every thread's frame stack. For a start it is
faced with the problem of finding a consistent cut. It can't hope to
always "understand" some given source code because of the halting
problem. How does it know which objects in memory to trust, and which
not to? Eg is an object just for temporary purposes for some
algorithm? How does it know what the algorithm is for?
But you can't do that (your first sentence) for /any/ program. You can
only demonstrate correctness at run time by either testing it or through
logical reduction from the initial design down to the Assembly to
demonstrate that the design was correctly implemented at each level
under the rules for each refinement in those level transformations. IOW,
you would have to prove that things like the 3GL compiler did the right
thing. There is no way you are going to apply RM or anything else to a
bunch of 1s and 0s in memory to directly determine correctness after
optimizing compilers, linkers, loaders, and the OS have gotten through
chewing on the design.
I wasn't comparing technologies. My point is that many OO systems
are best described as complex state machines (rather than models of
problem space entities), and this conflicts with your semantics of
object identity.
You can emulate that with the proper tools, like a 3GL source debugger
or a UML model simulator. But you are depending on the tools to provide
the necessary mapping of the 1s and 0s produced to the high level
constructs you are looking at in the tool. IOW, the tool provides the
logical reduction from the user view to the machine view. So if the
code generator or compiler screwed up and didn't follow the
transformation rules the tool depends upon, that will be manifested in
the debugger/simulator as inexplicable results. Been there; done that.
So?
Thinking of OO merely in terms of simple class diagrams and modelling of
relationships is at best an over-simplification. More to the point,
that limited view emphasises exactly what OO is poor at :
classification and storing relationships about entities.
First, where do you get the idea that this is all I see in OO? One
defines /solutions/ and those solutions can be validated at the UML
model for functional requirements.
Complex algorithms often can't be validated merely with a UML model.
Eg, an algorithm may require proof by induction. How do you prove that
a multithreaded design can't dead-lock? UML is certainly useful, but
hardly the be all and end all.
However, the second sentence really boggles my mind. Relationships are
crucial to OO development. Among other things, they are critical to
managing access to data. OO relationships are implemented at the object
level rather than the class (table) level in order to restrict access to
data. OO relationships are also crucial to tailoring solutions to the
problem in hand so one usually has much better performance than one
would have if restricted to the RDB view of relationships. I could
probably do several more paragraphs on how unique and important OO
relationships are to the paradigm.
OO is good at classifying and storing relationships about objects
(using my definition of object - which is directly associated with
instances). OO is relatively poor at classifying and storing
relationships about entities - as I said above.
You don't draw the distinction even though it exists and is easy to
define. I don't even know where instances feature in your formalism.
You think of objects as values that implicitly represent entities.
That may be true when OO is (mis)used in business applications. It is
not true in problem domains where OO works really well.
In systems programming you deal with things like queues, stacks, maps,
sets, threads, mutexes, semaphores, caches, smart pointers. A design
is expressed in terms of these building blocks. When I push an
element onto a queue, I know the queue instance has changed, not some
(external) entity. Identity is associated with the instances in
memory. You say the instances in memory are just abstractions for
mathematical notions of stack etc, and don't feel any need to associate
identity with the instance in memory. How can you possibly validate
the source code? More specifically, the imperatively expressed
algorithms.
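A short sketch of the queue case, using std::queue from the standard library. The function name pushAffectsOnlyOneInstance() and the variable names are assumptions made for this example.

```cpp
#include <cassert>
#include <queue>

// Two queues start with identical (empty) content. Pushing onto one
// changes that instance only, so the two are told apart by instance,
// not by what they contain.
inline bool pushAffectsOnlyOneInstance() {
    std::queue<int> pending;
    std::queue<int> done;        // same content as 'pending', different instance
    pending.push(42);
    assert(pending.size() == 1); // the instance that was pushed to has changed
    return done.empty();         // the other instance is untouched
}
```

Before the push the two queues are indistinguishable by content, so reasoning about the imperative code has to refer to which instance was modified.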
In Business applications there are lots of entities that exist
independently of the computer. It doesn't surprise me that OO is good
for systems programming and not so good for business applications.
Bottom line: if there is one single thing that would give OO an
advantage over other approaches to software development (where it is
applicable), it would be the way the OO paradigm deals with
relationships. It is a far more versatile approach than the RDB-style RDM.
You say RM is inferior to OO for storing relationships, even though R
stands for "relational" in RM. That's a bold statement! I think it's
too generic and lacks any substance. I neither agree nor disagree.
The above "imperfections" of the OOPL don't exist at all, because OO
works at a lower level semantic than you prescribe.
Wow. We are getting further and further apart with each statement. I
have already pointed out that OOPLs are a poor place to understand the
OO paradigm because they are already at too low a level of semantic
abstraction because they use things like stack-based scope and
procedural block structuring. Those compromises with hardware
computational models trash fundamental OO notions like separation of
message and method so that they don't even exist in 3GL type systems.
Yes, we have a fundamental difference of opinion.
While technically ensuring unique identity for each object instance in
the language implementation, that mechanism opens up a host of
referential integrity problems that are pushed off onto the developer.
That's why the abstract action languages for OOA/D don't allow that to
happen; instance creation is a fundamental operation and the instance
identity mechanism is not exposed to the developer.
I don't agree. Just find a Clone() method on a class in an OOD.
None of the AALs I know of for OOA/D will allow you to do that. They
all treat object instantiation as a fundamental operation where one must
fully initialize the instance as part of the creation. That operation
will always produce a unique instance (i.e., one can't "reuse" a memory
instance as you can in C++). The model simulators and debuggers will
also flag duplicate identity attributes as an error. [If an object is
identified by an embedded attribute, then that attribute must be
designated as such in the Class Model, just like in a Data Model. So
the model debugger will detect the duplicate, just like a DBMS would in
a properly normalized RDB.]
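The kind of duplicate-identity check described here can be approximated in C++ by hand. The sketch below is not how any particular action language or model debugger works; EmployeeRegistry, Add() and Count() are hypothetical names, and the check relies only on std::map::emplace refusing to insert an existing key.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical registry that treats 'name' as a designated identity
// attribute, much like a primary key in a table.
class EmployeeRegistry {
public:
    // Returns false (and changes nothing) if the identity already exists.
    bool Add(const std::string& name, double salary) {
        return salaries.emplace(name, salary).second;
    }
    std::size_t Count() const { return salaries.size(); }
private:
    std::map<std::string, double> salaries;   // key = identity attribute
};

// Adds two employees, then tries to add a duplicate identity.
inline bool duplicateIdentityRejected() {
    EmployeeRegistry registry;
    bool first = registry.Add("Albert Einstein", 25000.0);
    bool second = registry.Add("Kurt Godel", 29000.0);
    bool duplicate = registry.Add("Albert Einstein", 31000.0);
    assert(registry.Count() == 2);            // the duplicate was not stored
    return first && second && !duplicate;
}
```

Unlike the tools described in the post, nothing forces a C++ program to route instance creation through such a registry; the check holds only where the developer chooses to use it.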
IMO you are talking about some small off shoot of OO used specifically
to model entities. I think this is a misuse of OO. For you it is
what OO is all about!
And the reviewers would burn you at the stake even if you could. B-)
They would do that for the same reasons a DBA would go apoplectic if you
tried to turn on the option of duplicate primary keys in a table. [I've
never understood why a lot of RAD IDEs allow that. I belong to the
school that believes tools should enforce good practices rather than
inviting bad ones.]
[In part the OOPL is confusing things by allowing the object to be
instantiated without proper initialization of identity. (One could not
do that in any of the abstract action languages used for OOA/D that I
know of.) In part the OOPL is confusing things by allowing an
optimization by reusing the memory for the object without the overhead
of heap operations. IOW, the OOPL designer is offering the developer an
opportunity for foot-shooting by washing his hands of referential
integrity issues and pushing them all on the developer. This is why I
argued that looking at OOPL code is not a good place to learn OO.]
IMO the only reason the OOPL is confusing things is because you want to
deal with object identity at the level of the entities in the problem
space. This problem disappears if you simply associate objects (and
their identity) with nothing other than the class instances that reside
in memory. In my mind this is a case of "less is more".
The problem is that objects, unlike RDB tuples, usually do not have
explicit identity. Instead it is often defined referentially, which
maps conveniently to a memory address in hardware so the OOPLs provide
infrastructures around that paradigm.
We don't share the same definition of "object". For you it is an
entity (I think). For me it is an instance in memory.
Right. This is becoming clear. However, in the OO paradigm an object
is a uniquely identified abstraction, not an instance in memory. One
uses 'instance' for the memory image of the object.
Can you find a reference that states that?
However, when the objects do have explicit identity -- as in your
Einstein/Godel example -- there is a problem. That's because the OOPLs
provide no infrastructure for identity attributes. Unlike designated
keys in an RDB table, there is nothing special about such attributes.
That makes it perfectly legal to change the name "Albert Einstein"
(25000) to "Kurt Godel" (29000) in the Einstein instance within the OOPL
syntax rules. IOW, explicit identity is purely in the mind of the
developer and all we can do is break the thumbs of developers who do
things like changing object identity attributes on the fly.
Thus the OOPLs fail to support object identity mapping fully. However,
to avoid referential integrity chaos, the developers /must/
methodologically know what object identity is at OOA/D time and treat it
with the respect it deserves. (If they get it right in the OOA/D, then
it doesn't matter what sort of foot-shooting the OOPL in hand allows.)
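One convention a C++ developer might adopt to protect an identity attribute is to make it const and drop the setter. This is only a sketch under that assumption: the language still has no concept of an identity attribute, and the member names and the helper salaryChangesButIdentityStays() are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical variant of Employee in which the identity attribute is
// fixed at construction. const is a convention chosen by the developer,
// not an identity mechanism provided by the language.
class Employee {
public:
    Employee(std::string name, float salary)
        : name_(std::move(name)), salary_(salary) {}

    const std::string& GetName() const { return name_; }   // no SetName()
    float GetSalary() const { return salary_; }
    void SetSalary(float newSalary) { salary_ = newSalary; }

private:
    const std::string name_;   // identity: cannot be reassigned
    float salary_;             // ordinary state: free to change
};

// A const member also removes copy assignment, so one instance cannot be
// overwritten wholesale with another employee's state.
static_assert(!std::is_copy_assignable<Employee>::value,
              "Employee should not be assignable");

inline bool salaryChangesButIdentityStays() {
    Employee e("Albert Einstein", 25000.0f);
    e.SetSalary(26000.0f);
    assert(e.GetName() == "Albert Einstein");   // identity untouched
    return e.GetSalary() == 26000.0f;
}
```

With this variant the sequence in foo() no longer compiles, because there is no SetName() to call. The cost is that such instances cannot be assigned to, which limits how they can be stored and rearranged in containers.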
IMO if the code expressed in the OOPL doesn't map simply to the OOA/D
then OO is being misused. Don't confuse ER diagram and class diagram.
I agree. The problem is that the OOPLs (especially poorly formed ones
like C++) let you do things that OOA/D would never allow you to do.
Again, that's why I keep contending that looking at OOPLs is not the
best way to understand OO development.
<aside>
OOA/D has evolved a whole lot since Kay & Co came up with Smalltalk. It
still amazes me how close they came to Getting It Right the first time
out of the box. Smalltalk is still one of the best OOPLs to use as a
novice because it makes fewer compromises. (More precisely, it hides
them better.)
</aside>
That is clearly in conflict with the "misconception" that was defined
in section 2. I am of course assuming that the interpretation map
makes use of the 'name' member of an Employee instance. Under this
interpretation, the definition of object identity changes from one call
to the next in function foo() above.
This poor definition of object identity makes it impossible for an
object to provide a Clone() method, such as
Employee* Employee::Clone() const
{
    Employee* e = new Employee;
    e->name = name;
    e->salary = salary;
    return e;
}
The reason is that under the interpretation the clone would map to the
same entity (in the problem space), in conflict with the assumption
that the interpretation is 1-1.
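The conflict can be shown with a small sketch. It assumes, as the text does, that the interpretation uses the name member; Interpret(), the free function Clone() and cloneBreaksOneToOne() are hypothetical stand-ins written for this example, with the entity represented simply by its name.

```cpp
#include <cassert>
#include <string>

struct Employee {
    std::string name;
    double salary;
};

// Hypothetical interpretation map: instance state -> problem space entity.
inline std::string Interpret(const Employee& e) { return e.name; }

// Free-function stand-in for the Employee::Clone() method in the post.
inline Employee* Clone(const Employee& src) {
    return new Employee{src.name, src.salary};
}

// True when two distinct instances map to the same entity, i.e. the
// interpretation is not one-to-one on instances.
inline bool cloneBreaksOneToOne() {
    Employee original{"Albert Einstein", 25000.0};
    Employee* copy = Clone(original);
    assert(copy != &original);                        // distinct instances
    bool sameEntity = (Interpret(*copy) == Interpret(original));
    delete copy;
    return sameEntity;
}
```

The two instances have different addresses but the same image under Interpret(), so a map from instances to entities cannot be 1-1 once Clone() is allowed.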
This is a different problem. A true clone function creates different
instances with unique identity (address) but the instances all map to
the same object abstraction Employee{name,...}. This creates an even
nastier set of problems for referential integrity for relationships when
one changes the salary. The solution here is to break the thumbs of any
developer who does something like this.
So you say the Clone() function is valid, but a developer who calls it
would be wise to first get insurance on his/her thumbs? :)
Exactly. It is valid at the 3GL level because of imperfections in the
OOPLs, but it is not valid at the OOA/D level.
All I can say is YUK
Why? What need is there for clones? Give me a plausible example
Ok. A user creates a CAD drawing. This uses the composite pattern for
grouping the basic graphics elements. Therefore a drawing has a tree
structure. Semantically a node object (= instance) owns all its
descendant graphic elements. A node object isn't semantically bound to
some particular external entity. Therefore it is semantically valid to
clone any node (which implicitly means a whole sub-tree of the drawing
is cloned). The clone takes on an independent identity, and can be
inserted somewhere else in the tree. This is needed to implement copy
and paste - quite a useful thing in a drawing program, particularly
when the clone will then be modified to suit a specific requirement
needed by the user.
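A minimal sketch of that copy-and-paste case follows, assuming C++14 (for std::make_unique) and a single hypothetical Node class standing in for the composite pattern. The class, its members and the helper pastedCopyIsIndependent() are illustrative names, not taken from any real CAD system.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical composite node for a drawing.
class Node {
public:
    explicit Node(std::string label) : label_(std::move(label)) {}

    // A node owns its descendants, so they are held by unique_ptr.
    Node& AddChild(std::unique_ptr<Node> child) {
        children_.push_back(std::move(child));
        return *children_.back();
    }

    // Deep copy of this node and its whole sub-tree.
    std::unique_ptr<Node> Clone() const {
        auto copy = std::make_unique<Node>(label_);
        for (const auto& child : children_) {
            copy->AddChild(child->Clone());
        }
        return copy;
    }

    void SetLabel(std::string label) { label_ = std::move(label); }
    const std::string& Label() const { return label_; }
    std::size_t ChildCount() const { return children_.size(); }
    Node& Child(std::size_t i) { return *children_.at(i); }
    const Node& Child(std::size_t i) const { return *children_.at(i); }

private:
    std::string label_;
    std::vector<std::unique_ptr<Node>> children_;
};

// Clones a group, edits the clone, and checks the original is unchanged.
inline bool pastedCopyIsIndependent() {
    Node group("group");
    group.AddChild(std::make_unique<Node>("circle"));
    std::unique_ptr<Node> pasted = group.Clone();
    assert(&pasted->Child(0) != &group.Child(0));   // distinct instances
    pasted->Child(0).SetLabel("square");            // modify only the copy
    return group.Child(0).Label() == "circle" &&
           pasted->Child(0).Label() == "square";
}
```

The clone starts with the same content as the original but is a separate instance at every level of the sub-tree, so later edits to the pasted copy do not leak back into the source.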
from
some problem space where one would have use for clones (i.e., problem
space entities exist without unique identity). Or an example of a
design where one needs to have multiple instances with the same identity
based on a single entity.
It seems you assume OO run time state *always* binds (under
interpretation) to specific entities in some problem space. That is
simply not true in general.
Ok, as promised above I will now state how I believe object identity
works. This is a mixture of formal and informal (because I'm too
lazy). I think this is the prevailing view amongst the majority of OO
programmers...
It really is just a set of definitions...
Definition: An *entity* is a thing that can be modelled by a computer.
An entity generally exists independently of the computer that models
it. An example of an entity is a human. Another example is the
integer 2189.
OK, but I really don't like the last sentence. Fundamental units of
computation really shouldn't be regarded as equivalent to problem space
abstractions, however convenient that might be to 3GL design.
Sure,
there is some mathematical concept behind the notion of Integer Number
that is abstracted, but that notion is really only of interest to
implementing software on a hardware computer.
Yes, but that is after all the topic of this discussion.
Then we have a big disconnect. I have been talking about solving
problems using an OO methodology, not designing tools.
Note that your sentence betrays an aversion to wanting to treat numbers
as real in any sense whatsoever. You use words like "concept",
"notion", "abstracted". I suppose you say most numbers don't exist
because no one has written them down. I on the other hand am a
Platonist and don't lose any sleep over this, or force myself to
pollute my sentences with lots of additional but meaningless words to
indicate that numbers aren't real.
I have no problem with numbers as a concept. I don't even have a
problem with them being treated as objects in the implementation of an
OOPL. What I have a problem with is making them first class objects
having equal stature to those that abstract problem space entities.
Saying a number is an object just seems absurd to me. Objects by
definition have identity, state and behaviour and are associated with
instances in memory.
Note that RM happily stores relations about humans or numbers in the
same database. They are all just "named" entities. Eg < is a
relation on numbers. PROLOG makes it clear that you treat < as just
another predicate, as if you had listed all the unit clauses defining
the relation yourself.
I find this to be a simple and elegant way to think.
One only needs a notion of 'number' for OOA/D that is an abstract scalar
value of some object /attribute/. And often one will use an ADT to
provide a semantics for that number that is quite different than the
mathematical concept. It is only in the OOPLs that one needs to provide
an explicit storage type for it that maps to the hardware's notion of
'number'. But even at that level, it is still primarily just a value
for an attribute and the programmer usually doesn't have to think of it
as a unique location in memory with particular alignment, endian, etc.
So I am more of a Freudian: sometimes a cigar is just a cigar and
sometimes a number is just an attribute value.
IOW, once one is out of
the realms of computers or pure mathematics, the notion of Integer as an
entity is a pretty alien concept.
I don't disagree
There 2189 is just a value of some
bit of knowledge.
No. Outside the realm of mathematics, 2189 doesn't even exist, and
neither does computer science.
Try telling that to a customer who is looking at their bank balance.
B-) Numeric values do exist ubiquitously in most customer problem
spaces. That's why computers are ubiquitous and knowledge is one of the
two ways one can abstract problem space entities in the OO paradigm.
Commenting on this just seems rather pointless. Do you consider
computer science to be a branch of applied mathematics?
Thus entities can be abstracted with knowledge but
the bits of knowledge aren't at the same level of abstraction.
Huh?
An object abstracts a problem space entity. Part of that abstraction
are bits of knowledge that the entity is responsible for knowing. But
those bits of knowledge are properties of the entity, not the entity
itself. My point above about numbers is that all first class objects
should abstract entities, not properties.
I find the lack of formalism in what you say excruciating.
Definition: A *problem space* is a (mathematical) set of entities that
are relevant to solving a given problem using a computer. Entities in
a single problem space are allowed to form has-a relationships. For
example, Albert Einstein is an entity, and Albert Einstein's left
eyeball is an entity as well. Entities in a single problem space are
allowed to be at different "levels of abstraction". Basically there
are no restrictions!
Relative to your first definition, I think there are restrictions. For
example, entities must have abstractable properties that are relevant to
the solution. I would go further and argue that entities should have
multiple properties. (Only one may be relevant to the problem in hand,
but then a reviewer should want a lot of justification that was so.)
Remember that the OO paradigm provides for three fundamental levels of
abstraction: subsystem; object; and responsibility. (Nesting, as in
subsystems, is allowed and objects can embed other objects, but
mechanisms like implementation hiding mitigate that.) It is subsystems
and objects that map to problem space entities. By implication problem
space entities are necessarily complex to support further subdivision
into properties.
Entities exist independently of the computer. So I don't see what any
of the above has to do with it.
But subsystems and objects are solution constructs. The scale of the
construct is important. Subsystems and objects abstract problem space
entities. As such they are first class objects and they have unique
identity. However, numbers are only relevant as values of knowledge
properties. So they don't need to be first class objects with unique
identity. Their identity is already expressed by the identity of the
property and identity of the owning object.
YUK. Are we even in the same profession?
Definition: An *entity-type* is a set of entities used for
classification purposes. For example, the set of all humans, or the
set of integers Z. It is permissible for an entity-type to be
regarded as an entity.
Definition: A *class* is associated with a class definition written in
some OOPL that defines both state and methods. For our purposes we
are only interested in concrete classes.
Let's keep it to OOA/D sets rather than the baggage of 3GL type systems.
Entities belong to sets (aka classes). Set membership is based on all
members having the same property set (defined by the class).
No. Please don't change my definitions. Classes are not sets of
entities.
I am just stating what the OO paradigm is. An OO class is a set of
identified objects in OOA/D. That maps into a type in OOP but
sacrifices a lot in doing so (e.g., separation of message and method).
So it would be less distracting to talk about the OOA/D view. We are
already far too hung up on notions like 'instance' and cloning addresses
in poorly formed OOPLs.
Definition: A *process* is associated with a particular execution of
the program on a given computer. A process has its own address space.
Not relevant. One of the tests of a well-formed OOA model is that it
can be unambiguously implemented as a manual system in the customer's
environment (however inefficient that might be). [Obviously computing
space applications like DBMS engines are an exception.] Thus any notion
of object identity has to be independent of the computing environment.
(Instance identity, OTOH, is dependent on the implementation environment.)
However, I would buy a similar definition related to scope _within an
application_. In OOA/D that scope is a subsystem.
It is relevant because "process" appears in the definition of class
instance. You want to abstract the run time system away. I do not.
You are interested in a higher level semantic. I am not.
Definition: A *class-instance* is a particular instantiation of a
(concrete) class in a particular process at a particular location in
memory. For our purposes we don't care whether the instance is
associated with a global variable, a frame variable or was allocated on
the heap.
I don't like the term "class instance". An object instance is
implemented in memory, not a class. The class is instantiated within
some scope by defining a set of identified object members. Relative to
the point above, that class instantiation clearly does not have to be in
memory; it is just a set of identified objects.
Good point. But I wanted to distinguish between instances that act as
values versus objects. So I'll just say "instance" instead.
I am concerned about that distinction being necessary or desirable.
Surely not.
The only things that have values in an OO context are object properties (and
object identity when it is not an explicit attribute). But object
properties do not have unique identity; their identity is attached to
the object. So there is no separable instance of a value unless one is
talking about computer hardware implementation, which is only relevant
if one is implementing an OOPL (e.g., computing offsets from the
object's address).
Why should a value-type be limited to a member of an object?
Definition: A *concrete-type* is either a concrete class or a simple
type like int, float.
Definition: A *value-type* is a concrete type used for program
variables, that represents an entity in the problem space. This
involves some given interpretation (ie map) from its state (ie bytes in
memory) to entity. When a value-type variable is copied both the
original value and the new value will represent the same entity.
Examples of value-types are int, float, as well as certain classes like
std::string (the STL string for C++). It is generally wrong to take
the address of a value-type variable for the purposes of pointer
comparisons.
Definition: At any given point in time a value-type variable is said
to contain a particular *value*. A value is associated with the entity
that the variable represents (under the interpretation as a
value-type).
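As a rough illustration of the value-type definitions, the sketch below copies a std::string and asks two different questions of the pair: do they hold the same value, and are they the same variable in memory? The helper names sameValue, sameLocation and copyKeepsValueButNotLocation are hypothetical, and the string "Fred" is just sample data.

```cpp
#include <cassert>
#include <string>

// True when two variables hold the same value, i.e. represent the same
// entity under the value-type interpretation.
bool sameValue(const std::string& a, const std::string& b)
{
    return a == b;
}

// True only when both references name the same variable in memory.
// For value-types this is generally the wrong question to ask.
bool sameLocation(const std::string& a, const std::string& b)
{
    return &a == &b;
}

// Copies a value and reports whether the copy still represents the same
// entity even though it lives at a different address.
bool copyKeepsValueButNotLocation()
{
    std::string original = "Fred";
    std::string copy = original;  // value-type copy
    return sameValue(original, copy) && !sameLocation(original, copy);
}
```

The copy compares equal by value while sitting at a different address, which is why pointer comparison tells you nothing useful about a value-type.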
I really don't like introducing type systems. They are a 3GL
implementation mechanism for OOA/D class systems. As such it becomes
difficult to separate the local implementation issues from the
fundamental OOA/D issues. The OOA/D set-oriented view is much simpler,
more general, and less dependent on the vagaries of implementation.
I want to talk about the semantics of object identity with respect to
some given source code, not some UML diagram or whatever, particularly
if it is abstracted away from the source code.
I don't think the notation matters. An object exists as soon as the
developer thinks about a set of problem space entities and abstracts
their properties for the problem in hand. The object identity is only
relevant to (mappable to) the problem space entity.
At run time the object will be instantiated somehow as a uniquely
identified instance of the object over some period of time dictated by
the problem solution. It doesn't matter what the implementation
mechanism is; it just has to honor the notion of one object == one
instance and provide an unambiguous mapping between the object and instance.
However, as soon as you start introducing type systems you are talking
about a specific OOPL context and that presents an opportunity for
getting side tracked into the vagaries of poorly formed OOPLs that do
things like allowing instances of different objects to have the same
identity (address).
[BTW, the Action Semantics documentation for UML AALs goes to great
lengths to describe the execution model of a program because that is
necessary to map the UML dynamic descriptions unambiguously. Alas, it
is a remarkably obtuse document.
Inevitably, I would say.
Nonetheless, those mappings of the
execution model (instance) to the OOA/D model (objects) are pretty much
what I have been describing. So if you want to check whether I am
making all this up as I go along, that's the most authoritative source.]
It just sounds like a dog's breakfast to me.
Definition: An *object* is a class-instance that is regarded as having
identity tied to that class-instance. For the purposes of identity,
the object does *not* represent an entity under some interpretation.
Although it certainly may (in the mind of the programmer) that is
entirely irrelevant to the semantics of object identity.
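A minimal sketch of what this definition implies, using a trimmed version of the Employee class from Section 3 (salary omitted for brevity). The helpers sameObject, sameState and equalStateDistinctIdentity are hypothetical names introduced for illustration; the only assumption is that identity is tested by comparing addresses.

```cpp
#include <cassert>
#include <string>

// Trimmed version of the Employee class from Section 3.
class Employee
{
public:
    std::string GetName() const { return name; }
    void SetName(std::string newName) { name = newName; }
private:
    std::string name;
};

// Identity test for objects: same class-instance, i.e. same address.
bool sameObject(const Employee& a, const Employee& b)
{
    return &a == &b;
}

// State comparison (hypothetical helper); says nothing about identity.
bool sameState(const Employee& a, const Employee& b)
{
    return a.GetName() == b.GetName();
}

// Two instances with identical state are still two different objects.
bool equalStateDistinctIdentity()
{
    Employee e1;
    Employee e2;
    e1.SetName("Fred");
    e2.SetName("Fred");
    return sameState(e1, e2) && !sameObject(e1, e2) && sameObject(e1, e1);
}
```

Under this definition e1 and e2 are distinct objects even though their state matches, and nothing in the identity test consults what either one might represent in the problem space.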
OK, here is where we part company in a big way.
One can argue that any formalism tends to be stilted and that the OO
paradigm must have an underlying basis in mathematics where unique
definitions of things like /value/ prevail. However, I leave that to
the boffins who design OO methodologies, OOA/D notations, and OOPLs.
That has all been resolved at the level of solving problems using an OO
approach and the OO paradigm has methodological constraints on solution
construction...
I don't think you're being honest to *actual* OO (ie actual source
code). You hide behind abstractions of the source code like UML class
diagrams.
Of course. You can abuse any paradigm. To use my favorite paraphrase
of G. B. Shaw on Christianity, the only trouble with OO is that it
hasn't been tried. Most supposedly "OO" software today is really just
FORTRAN and C programs with strong typing. But I am assuming that one
wants to construct software in a methodologically correct manner. The
methodologies allow one to construct well-formed OO software without a
lot of angst over the underlying mathematics. Those OO methodologies
lead to the following quoted point because they /define/ the constraint
to ensure the underlying mathematics is not abused.
But end up making a mathematical mess.
Objects abstract problem space entities and they must have unique
identity that is unambiguously traceable to that of the problem space
entity that they abstract.
I think that's just an unnecessary limitation, revealed in the enormous
amounts of real code that don't follow that rule at all - such as a
Clone() method.
And they break all the time because of referential integrity problems
that they wouldn't have had if they played by the methodological rules.
I don't think that's where OO comes off the rails at all. The biggest
problem is using it in the first place to solve a problem that it
wasn't intended for.
I am not making this up; it is fundamental
to the OO paradigm because it enables an explicit mapping between the
problem space and the software solution that was missing in previous
approaches.
I don't know why you think that. I see plenty of disadvantages. If
that is fundamental to OO, then OO is living a lie.
Name another software development paradigm that:
-- provides more clearly defined stepping stones between the problem space
(requirements analysis), high level design (OOA; functional
requirements), low level design (OOD; nonfunctional requirements), and
tactical design (OOP; technologies and platforms);
-- systematically bases software structure on problem space structure;
-- where high level design can be directly validated through testing;
-- that provides solutions that are more robust and maintainable in the
face of volatile requirements.
Relevance? Remember I don't think OO is bad. I just think it is
occasionally misused.
Those things are only achievable if one has an explicit mapping to
problem space structure. However, that mapping depends on carrying
identity through entity -> object -> instance in a systematic and
unambiguous (read 1:1) manner.
I disagree.
You can postulate an A&D system where that is not
necessarily true and where other mappings to the problem space are
provided (P/R and FP being obvious examples), but it would not be an OO
methodology.
My definitions are not at odds with OO, as long as you don't blur the
distinction between instance and entity. Most software using OO
doesn't blur the distinction.
I think the blurring lies in essentially eliminating the OO view of
'object' by trying to go directly from problem space entity to memory
instance. The gulf between problem space and computing space is just
too broad to do that reliably. So the OO notion of an object as an
abstraction between problem space entity and memory instance is an
important stepping stone in a conceptual chain of representation transformation.
I was hoping you would accept the definitions for what they are (merely
definitions), and use that as a basis for properly understanding my
perspective, and then comment on whether there are logical
inconsistencies with what I'm saying, or else tangible disadvantages,
preferably using an example to illustrate your point. Instead you
have only given me general, hand waving arguments that mostly only
repeat your own alternative definitions. I made some specific points
about why my definitions are better than yours and you didn't comment
on them.
Cheers,
David Barrett-Lennard