Re: Object identity




David Barrett-Lennard wrote:
Mark Nicholls wrote:
We're in danger of never-ending Lahman-like posts....I will do a severe
chop.....

I think there are only really 1 or 2 issues here.

<snip>
As I see it, the moment you map your object-type instance to an entity
outside the computing space you should start to think of your
object-type as *also* behaving like a value-type.

However, I find it rather suspicious for an instance to try to be both
an object-type and a value-type at once. IMO that is precisely where
OO is moving into a domain better handled by more formal, declarative
approaches such as the predicate calculus.

I see no justification for imposing a constraint that an interpretation
mapping be 1:1. The fact of the matter is that external entities like
humans can be modelled in any number of ways. I see no reason at all
why any number of instances in memory can't model the same human. Eg
one instance can model a human as an employee and another as a medical
patient. It is impossible to deny that there is a many to one
relationship between model and what is modelled.

hmmm

no

I deny it. you have described a 1:1 mapping and then claimed it not to
be on the grounds of an unincluded many to 1 mapping from a person (in
different contexts) to the physicality of a single human....you're
cheating.

I don't think so. In my original post I talked about a 1:1 mapping in
section 2, which is a definition of a *misconception* about object
identity. I claim that forcing an interpretation to be 1:1 is an
unreasonable constraint.

OK...I disagree...largely...in the real world I think we do drop the
constraint...and impose rules about how the interpretation gets
resolved.....last update wins etc.


I have never said or implied that an interpretation should be 1:1.
Why do you say I'm cheating?

because you specify

"human as an employee"
"human as a medical patient"

set up a 1:1 mapping

"one instance can model ....."

according to the spec so far these two specifications are completely
independent.....between the lines we construct the model in our
heads....it is 1:1.....and then

"to the physicality of a single human"

ahhhhhh....

you've now changed the spec....you've now created relationships between
two previously independent 'entities'...

you now claim that we have an n:1 model.....but we don't....the model
constructed has no notion of the "physicality of a single human"....as
soon as we introduce "one instance can model " the "physicality of a
single human".....we now have a 1:1 mapping.

I agree with what you say and see no contradiction with my own point of
view. You agree with me that there is no problem having many instances
that are (somehow) associated with the one human. Furthermore these
instances have their own identity. I think of these instances as
representing precise mathematical models.

OK...(I still claim the model to be 1:1)


You are looking at the 1:1 mapping between instances and mathematical
models. This is exactly what I mean by the isomorphism of the
computational machine to an abstract computational machine. However I
prefer not to call that an interpretation because it is really about
the low-level formalism of OO. I prefer to use the word
"interpretation" to talk about the relationship between model and what
is modelled, only in those cases where there is no isomorphism, and in
fact there can't be because the entity (or aspect of the entity) that
is modelled is mathematically ill-defined (such as a human).

you persist in the claim that we are modelling 'humans'...yet you accept
we are only modelling aspects of humans...i.e. their interaction with
other entities....this can be well defined.

This is
particularly relevant when models are mutative, and we know that the
thing that is modelled will not implicitly synchronise with the model.

You may not like my use of the word "interpretation", and maybe my
terminology is not appropriate.

I've lost my own meaning of it now....I think it was how a sentence
mapped into a model.

I thought an interpretation relates
to how a problem domain expert validates whether the software program
reflects reality, and this is outside of the scope of pure computer
science.

Mine is based in mathematical logic.


IMO strictly speaking a computer program should not use class names
like "Employee". Instead use "EmployeeModel". Interpretation belongs
in the requirements specification, not in the implementation.

interpretation (I think) would be the mapping from the spec
(EmployeeTheory) to an entity in a model...either the real world
Employee or the EmployeeModel.

Ideally
source code should consistently read at the level of the abstract
machine. The audience of the source code is computer scientists,
not experts in the problem domain!

hmmm...it's a side issue I think...the hardest thing about writing good
software is immersing yourself truly in the problem domain, i.e.
becoming a problem domain expert.


I claim that by contrast there is no need to call a class "StackModel"
instead of just "Stack", because a stack is already a suitable name for
a mathematical entity that takes direct part in the abstract
computational machine. Do you see the difference? Do you agree that
some entities are implicitly part of the abstract machine and some
aren't?

not sure.

I still am worried that you are confusing the abstraction
EmployeeTheory in the spec...with a real employee....who eats, drinks,
sleeps and watches TV.




I demonstrated that forcing an interpretation to be 1:1 gets you into
trouble. For example, objects can't have a clone method.

there is nothing sacred about clone....maybe theoretically clone is
dubious.....maybe our instances don't map to changing entities in the
problem domain...but to entities at time T.....then it would appear
that we do have multiple instances per target entity....but we don't we
have changed the spec and the model...our entities are now "entities at
time T"....1:1

To me it's as simple as this: If like Lahman you don't distinguish the
identity of model and what is modelled, then a clone function seems
evil. However, if you read the software at the lower level semantic
where instances represent models (and not what is modelled), then
cloning an instance corresponds to merely cloning a model which is
fine.
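
A minimal sketch of that reading, in C++. The class name EmployeeModel follows the naming suggestion made earlier in this post; the fields and the helper clone_is_independent are assumptions made up for illustration. It only shows that cloning produces a second model with its own identity, and says nothing about the person the models are associated with.

```cpp
#include <string>
#include <utility>

// Hypothetical model class; not taken from any real code base.
class EmployeeModel {
public:
    EmployeeModel(std::string name, int salary)
        : name_(std::move(name)), salary_(salary) {}

    // Cloning copies the model, not the human it is associated with.
    EmployeeModel clone() const { return *this; }

    void set_salary(int salary) { salary_ = salary; }
    int salary() const { return salary_; }
    const std::string& name() const { return name_; }

private:
    std::string name_;
    int salary_;
};

// Returns true if a clone is a second, independently mutable model:
// it lives at its own address and changing it leaves the original alone.
inline bool clone_is_independent() {
    EmployeeModel original("Bloggs", 100);
    EmployeeModel copy = original.clone();
    copy.set_salary(200);
    return &original != &copy
        && original.salary() == 100
        && copy.salary() == 200;
}
```

Whether such a clone is acceptable still depends on the spec, which is the objection raised in the reply that follows.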

unless it is excluded by the spec....and if the spec sensibly says
things like, when I take £500 out of my current account...I can't take
it out again...it is excluding clone.


Your proposal that mutable/immutable is behind it all doesn't seem
quite right to me. The reason is that the elements of a computational
machine have identity, state and behaviour. When an instance is said
to be isomorphic to a mathematical entity in an abstract machine, its
ability to behave is a crucial part of the OO formalism. In fact if
anything OO characterises an object by its behaviour rather than its
internal state. This is the ADT way of thinking. For example a stack
is formalised with axiomatic descriptions of the push and pop
functions.
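
As a rough illustration of what "axiomatic descriptions of push and pop" could look like in code, here is a sketch that checks the two usual stack axioms, pop(push(s, x)) == s and top(push(s, x)) == x. The Stack type and the function names are assumptions chosen for the example, not anything defined in this thread.

```cpp
#include <vector>

// A purely functional stack of ints: push and pop return new stacks
// instead of mutating their argument.
struct Stack {
    std::vector<int> items;
};

inline Stack push(Stack s, int x) { s.items.push_back(x); return s; }

// Precondition: s is not empty.
inline Stack pop(Stack s) { s.items.pop_back(); return s; }

// Precondition: s is not empty.
inline int top(const Stack& s) { return s.items.back(); }

inline bool equal(const Stack& a, const Stack& b) { return a.items == b.items; }

// Checks both axioms for one particular stack s and element x.
inline bool axioms_hold(const Stack& s, int x) {
    return equal(pop(push(s, x)), s) && top(push(s, x)) == x;
}
```

Checking the axioms for a few inputs is of course a test, not a proof; the point is only that the stack is characterised here by its operations rather than by its internal state.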

you persist with the view that immutable types don't exhibit
behaviour...this may be that we have different notions of 'behaviour'
but...

"i++"

has behaviour for me. There is an axiomatic description of '++'......it
has behaviour.




So on what basis does
this 1:1 rule come from? It seems to be plucked out of thin air.

it's to do with mutability of state...it is not plucked from thin air.

if my objects are immutable then there is no problem, each instance
indicates the real world entity in a different state (of time), the
programmer ensures that, across the sequence of variables he uses, time is
monotonically increasing.....if we roll this up into an 'object' then
we're potentially on dangerous grounds, because how do we know the
object on thread A we are interacting with is the object that
represents the entity now....and that thread B hasn't just withdrawn
£100 from his account....in reality we shove it all in a database and
use the records in the database as the domain of the mapping.

Note for a start that immutable objects can't exhibit behaviour. I'm
sure you agree you can't build systems entirely from immutable objects.

they exhibit behaviour to me.

it depends what's inside and outside the model......it's a good
question....that worries me......can I duck it.


You say an object represents an entity. Do you agree that's rather
vague? What does "represent" mean?

its formality is defined by the interpretation.

What I mean is that the word "represent" is overloaded. You can say
"that instance represents a stack", or "that instance represents a
human". The first usage relates to the isomorphism to the abstract
machine, the second relates to an interpretation by an expert in the
problem domain.

so?



Well, it can mean different
things! For example, when you mutate the object, does it imply that
the entity changes as well?

it often is the entity.


Consider that the computational machine has been mapped isomorphically
to an abstract computational machine. It's clear that changes to the
instance in memory are associated with changes to the associated entity
in the abstract machine. Now, if you want to treat this isomorphism as
an interpretation, then obviously the interpretation will be 1:1
(because isomorphisms must be bijections). I find all this good
mathematical formalism. You can have pure mathematical definitions of
lists, stacks and threads, characterised formally using axioms. This
approach forms a conceptual basis for proving OO programs correct.
This is what I call the formalism of OO.

Now instead consider an example where there are objects that are
associated with humans. I don't mind calling this an interpretation.
However, I think looking for a formal mathematical isomorphism between
instances and humans is silly. There is no sense in which the
computational machine can be mapped isomorphically onto humans, because
humans have no formal axiomatic definition. Humans aren't entities
that directly take part in the abstract computational machine.

their interaction with computer systems does have a formal axiomatic
definition.....when we model 'customer' in a banking system we don't
mean some bloke sitting at home watching television....we mean
something that calls 'withdrawal', 'deposit', 'borrow'....

You're identifying an instance with only an abstract model of a human,
in which case we agree. Note that you actually used the word "model"
in the above paragraph "... when we model 'customer' ...".

the use of model here is the everyday English usage.

I'm identifying it with an abstraction that will be modelled
(implemented) in the real world by a real physical human
being.....he/she will press 'OK'....and take the cash from the cash
machine....if you don't like that then we can simply think of the
physical keys that he presses...and the customer just becomes a stream
of key strokes.

We may agree.



I regard an association between objects and humans as important, but it
exists at a higher level semantic outside the domain of a proper,
well-defined mathematical formalism of OO. By contrast, RM is allowed
to store unit clauses that declare facts about external entities like
humans even though they don't take direct part in an abstract
computational machine. The formalism of OO can't do that.

show me in RM...I am ignorant.

Using Prolog notation, the following unit clause

father(abraham , isaac).

states the fact that Abraham is the father of Isaac. This corresponds
to a record in a table in RM. OO has no equivalent implicit formalism
for stating facts about named entities outside the computing space.

but I construct it.


RM is for storing facts. OO is for (directly) building computational
machines. Apples and oranges!

hmmm...I think this is just a syntactical point.




I consider computer science to simply be a branch of applied
mathematics. From that perspective I am not particularly interested in
an association between humans and objects, even though I agree it
exists and it's important. I don't see it as having any bearing on the
formal underpinnings of OO. I would rather treat OO in a precise
mathematical way. The opponents of OO have a valid argument when they
say OO (currently) lacks a proper formal basis. That is hardly a good
starting point for writing correct software.

What do you need for a formal basis?

Are you asking what's needed, not why it's needed?

For a formal basis we first need some definitions that make threads
like this one unnecessary because we all have an agreed understanding
on what we mean by "entity", "object", "instance", "object-identity",
"value semantics" etc.

OK....I personally believe most of those things would evaporate into
the ether.


Why do we need a formal basis? We need precision in how to interpret
the source code. Eg should you compare values of two Date instances or
just compare their addresses. Why?
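
To make the two options concrete, here is a small sketch. The Date struct and both function names are hypothetical; the sketch only shows that the two comparisons can disagree for the same pair of instances.

```cpp
// Hypothetical date type with plain value fields.
struct Date {
    int year;
    int month;
    int day;
};

// Value comparison: two dates are equal if their fields are equal.
inline bool same_value(const Date& a, const Date& b) {
    return a.year == b.year && a.month == b.month && a.day == b.day;
}

// Identity comparison: true only if both refer to the same instance in memory.
inline bool same_instance(const Date& a, const Date& b) {
    return &a == &b;
}
```

Two separately constructed Date instances holding the same year, month and day satisfy same_value but not same_instance, so the reader of the source code needs to know which one is meant.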

that's up to you...this can be formalised....if you choose to compare
values and define it as such...then that is your definition....there
would seem to be no need for more formalism here.


What's the greatest barrier to writing lots and lots of code? I don't
think the bottleneck is how quickly you can type, or how many
applications you can imagine being useful. IMO lack of formalism is a
significant barrier because the moment things get complicated bugs and
limitations start to appear. A lack of formalism is a barrier to
understanding your own code.

possibly...possibly not....a lot of very good programmers seem to
compete with each other on 'nerd' grounds....rather than 'fit for
purpose' grounds.....but I accept that maybe there are specific 'hard'
scenarios that lack some formal understanding.


This too used to worry me, and still worries Dmitry....I think a formal
basis is actually pretty trivial....if you're a half decent
mathematician....which I'm not.

I still think a full foundation is relatively trivial.



Consider a car simulation and the fact that one of the car objects may
represent a real car. I consider this fact to be more about the
requirements specification for the program. For example, if you move
the steering wheel to the left on the input controller, you expect the
car in the simulation to turn to the left, not the right. The
requirement that the simulation be (somewhat) faithful to reality is
rather broad and turns into lots of fine-grained requirements.

This would be a very informal spec.

Yes

This
leads into a fairly detailed functional specification of the program.

if at all possible....but then we are modelling the formal 'detailed
functional specification' not the informal "the simulation be
(somewhat) faithful to reality".

Yes.


However when you actually implement the software, the association
between car object and real car is not so relevant. I would say that
they have independent identity. This is clear because a car in a
simulation is not required to be synchronised with the real car on
which it was based.

there are 2 models here...

reality
and the computer model.

<snip> you answer below



identity is synonymous (to me) with instance.....so that is
inconsistent to me.


Furthermore I see no reason to distinguish
instance and object.

neither do I.....and I believe neither do probably 99% of mainstream
OO texts.

In other words, by definition an object is an
instance in memory.

yes....and to me identity is synonymous with it.


By contrast, a value-type semantically represents a particular entity
(eg number) under an interpretation. Many instances in memory of a
value-type are allowed to represent the same entity. For example there
can be many ints in memory that simultaneously represent the number
123. In other words it is permissible for the interpretation mapping
to not be 1-1. That is precisely why you can't assume that two
different instances in memory represent different entities. You always
have to compare the state.

OK, ...the thing about the 'number 123' is that it is never anything
else....its state is fixed (I've recently talked to Mr Kazakov about
exactly this point).......so 'representing' multiple times is fine.

Yes.


BankAccount.Balance = 123....is different....if I represent that
multiple times then I'm in trouble.

If you represent that at all using OO you may be in trouble! You need
a very good reason to favor OO over RM for storing that sort of
information.

?

no I don't, it's only an example of where allowing N:1 mappings gets
you into trouble.

how do I know if I'm overdrawn if the domain of the mapping to my
bank account exists in multiple places at the same time?
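
A sketch of that trouble, under the assumption of a hypothetical BankAccount struct holding the balance of 123 used above. Each copy checks the withdrawal against its own idea of the balance, so two withdrawals of 100 are both approved.

```cpp
// Hypothetical account record; copying it creates a second place where
// "the" balance lives.
struct BankAccount {
    int balance;

    // Withdraws only if this copy believes the funds are available.
    bool withdraw(int amount) {
        if (amount > balance) return false;
        balance -= amount;
        return true;
    }
};

// Two representations of one account each approve a withdrawal of 100
// against a balance of 123. Returns the total amount paid out.
inline int total_paid_from_two_copies() {
    BankAccount a{123};
    BankAccount b = a;  // second representation of the same account
    int paid = 0;
    if (a.withdraw(100)) paid += 100;
    if (b.withdraw(100)) paid += 100;
    return paid;
}
```

With a single representation the second withdrawal would be refused, because only 23 would be left.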

This is a subtle point. When a computer is the ultimate storage system
for bank balances then *by definition* that attribute is not actually a
model of something else. Rather it is the real deal. In that sense, I
agree it is semantically wrong to store that information more than
once. The strange thing is that a Person object doesn't really
represent a real person in the low-level formalism of OO

I agree it doesn't...so why are you getting so upset that it does.

Huh? What's to be upset about?

why are you making the distinction....see above, we're having the same
conversation twice.



, yet an
attribute of the Person object such as the bank balance truly is the
bank balance of the real person.

yes.

but Person is not a real person...it's a potential sequence of formal
interactions.

We agree then. Good.

maybe we do agree.



In any case RM seems to offer many benefits for storing information
about entities that are outside the computing space. OO seems inferior
for that particular purpose. I think it's one of the reasons why
OODBMS hasn't been able to displace RDBMS for business applications. I
don't think it's just a question of technology inertia or maturity.

I say this despite being a very strong proponent of OO, and don't
particularly like the idea of issuing SQL queries from an OO program.


The point is about its volatility of state.

you can have as many non volatile objects in the domain of your
interpretation as you like.

volatile ones must map 1:1.....(or your model must map to a model that
maps 1:1....it's easier just to say 1:1)

You mean mutable when you say volatile don't you?

I do, I tried to remember it, and gave up after 5 minutes of head
scratching and murmuring 'strings are......', 'strings are......',
'strings are......', 'strings are......',

immutable...!


I think I see what you're saying, but I never have this issue in my own
designs because I only use the OO approach in domains where objects are
objects and values are values. This limits the scope of OO, but not
much! I use OO to build computational machines that are decoupled from
things like humans and bank balances.

someone has to!

No. RDBMS can be used to store that information instead.

you seem to be troubled by the nature of the 'problem space'.....it's
just a problem space, it can be integers, gods, dogs or chocolate bars.

In one way I treat them all the same by calling them all entities.
This is particularly useful for RM which can store predicates about
numbers just as easily as humans.

show me an RM storing a human

I say predicates *about* humans. RM doesn't claim to be the real deal.
So no-one has arguments about it!

again I can construct this in OO.

IPerson GetFather();

is this really any different from labelling a column in a table
'FatherID'?
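
For comparison, here is one way both styles might be written side by side in C++. Everything in it is an assumption for illustration: FatherRelation stands in for a table of (father, child) facts like the Prolog clause above, and Person with get_father() stands in for the IPerson GetFather() accessor.

```cpp
#include <set>
#include <string>
#include <utility>

// Relational style: a set of (father, child) facts about named entities.
using FatherRelation = std::set<std::pair<std::string, std::string>>;

inline bool is_father(const FatherRelation& facts,
                      const std::string& father,
                      const std::string& child) {
    return facts.count(std::make_pair(father, child)) > 0;
}

// OO style: a hypothetical Person object answers the same question
// through a method that returns another object.
class Person {
public:
    explicit Person(std::string name, const Person* father = nullptr)
        : name_(std::move(name)), father_(father) {}

    // Returns nullptr when no father is recorded.
    const Person* get_father() const { return father_; }
    const std::string& name() const { return name_; }

private:
    std::string name_;
    const Person* father_;
};
```

In the relational version the entities appear only as names inside facts; in the OO version each name is wrapped in an instance that has its own identity in memory.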


However, objects in OO implicitly claim to be "real things" in the
sense that they have identity, state and behaviour. That's fine when
they say they're an entity in the computing space like a stack or GUI
button, but not if they pretend to be a human, rather than merely a
model of a human. You agree, don't you?

I do.....I do not make the distinction you do, I simply claim that when
we say 'Customer' it is shorthand for 'ExternalAgent', or
'StreamOfInput'.




In another sense, it seems very important to distinguish the
mathematical things with formal axiomatic definitions that are part of
the computing space. These are needed for a formal proof of
correctness of a program.

Consider the following function

void foo(int a)
{
    int c = 2*a;
    if (c == 0)
    {
        assert(a == 0);
    }
}

How do we prove that the assertion never fails? A formal proof
requires axioms or theorems about ints. As it turns out, the
assertion can indeed fail, a reminder that abstracting 32 bit ints to Z
carries some dangers.
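
A sketch of one failing case. Signed overflow is undefined behaviour in C and C++, so this version uses a 32 bit unsigned type, where doubling wraps modulo 2^32 with defined behaviour; the function name is an assumption made up for the example. On a typical two's-complement machine the corresponding signed input for foo would be the most negative int.

```cpp
#include <cstdint>

// Mirrors the check in foo() using wrap-around unsigned arithmetic.
// Returns true if c == 0 is reached with a != 0, i.e. the case in which
// the assertion in foo() would fail.
inline bool assertion_would_fail(std::uint32_t a) {
    std::uint32_t c = 2u * a;  // wraps modulo 2^32
    return c == 0 && a != 0;
}
```

For a = 0x80000000 (that is, 2^31) the product wraps to 0 although a is not 0, which is exactly the situation the abstraction to Z rules out.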

Anyway what I'm getting at is that ints are part of the computing
space, and axiomatic knowledge of them is required to reason about the
program. By contrast entities like humans are not, and cannot be.

again you seem to consider all aspects of a human to be in scope....I
am mildly confused by this.

What do you mean by aspects? Can you formalise that? Wouldn't you end
up distinguishing between model and what is modelled?

I think we're going over old ground see above.



I think Lahman and Kazakov believe all entities must be treated equally
because they say there doesn't exist any difference (at least that can
be formalised). I don't agree.

I agree with them....for now.

Don't you agree that computer space entities that take part in the
abstract machine are quite different than ill-defined things like
humans that can only be referenced by name?

yes....but 'they' are not in the model...only the reference to them
is...and the semantics of the reference (e.g. name, address, telephone
number) is outside the abstract machine....it doesn't really 'know'
that if I go around and ring the doorbell, Mr Bloggs will answer.

Do you really want to blur
the distinction between a mathematical model of an aspect of a human
and the human itself?

no.......the human is out of scope.





Note that OO is suitable for building a simulation, no matter how
accurately it is based on reality. In that case an object in the
simulation may be associated with a real entity, such as the car driven
by Jack Brabham in 1967. However that association shouldn't be
confused with the formality of an interpretation. The fact of the
matter is that the car in the simulation has independent identity from
the real car it is based on.

? they are two distinct interpretations of 1 theory...the computer model
is *not* a model of reality....(I think my great uncle may have
designed that car you refer to).

computer simulations are a leap of faith.....not logic....they appear
to work....sometimes....conclusions based on an (accurate!) theory of
reality are logically bullet proof (OK there's all sorts of 'ifs and
buts' here.....but it is fundamentally a different beast)

I see no reason why a computer simulation can't be very
accurate, at least in principle. It could model down to the linkages
to the accelerator pedal and the burning of the fuel in the cylinders.

Nevertheless a model is by definition, just a model - like you say a
fundamentally different beast from the real thing. That is why I want
to be so careful about the semantics of object identity.

OK, I haven't expressed myself well.

what I was getting at was that just because A models M and B models M
doesn't mean that a proposition true in A will be true in B.

Agreed. There is no formal basis to relate models to each other. A
simple example is where a resistor has been modelled twice and they
provide slightly different values of its resistance.

OK, there are formal bases, rather like subclassing.....but in general
they do not apply.



It is not useful to think of the car in
the simulation as a value-type.

I think you need to define value type.

I mean in the context of the simulation it has identity. However, more
formally see below.


Now if the simulation (or game) happened to have two cars racing around
the track that were both based on Jack Brabham's car from 1967, then
you can certainly argue that there is a problem.

I see no problem.....what theory was being modelled?

The requirements specification may have been violated.

or it may not....what was the spec?....what was defined to be the
entities being modelled?


However, someone
else may say that that's exactly what they wanted to do in the
simulation. It's obviously not going to make the software crash. I
consider this to be a semantic that is outside of the domain of what OO
is about. Trying to push that semantic into the formalism of OO does
more harm than good. It just muddies up the nice distinction between
object and value semantics.

nope....you've lost me, you've probably been through all this with
Dmitry and I'm 5 days behind.

define (loosely) the difference.

Let's use "instance" to talk about things in computer memory. It has
value semantics if, in the OO formalism it represents a mathematically
defined entity (such as a number) that is decoupled from the location
of the instance in memory. There can be many instances in memory at a
given moment that all represent the same entity. For example, there
can be lots of representations of the number 123. Value types can be
copied and compared. The location of the instance in memory is never
relevant to the semantics.

OK....


By contrast, an instance has object semantics if it is a distinct part
of the computational machine. It cannot be decoupled from the instance
(that is located at a particular address).

A bit more formally, when the computational machine is mapped to an
equivalent abstract computational machine, each instance with object
semantics must map 1-1 to an entity in the abstract machine. However,
multiple value type instances can be mapped to the same entity in the
abstract machine.

hmmmm

if you say so....where does this come from?

Me!

wouldn't the semantics of immutable objects be the same as that for
value types?

According to my definition above the answer is no, because there could
be two immutable objects (at different memory locations) that are
regarded as distinct. This may indeed affect the behaviour of the
machine if it performs some object identity tests by comparing
pointers.
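
A small sketch of how that can show up. The two counting functions are hypothetical; they count the "distinct" items in a list of pointers to immutable strings, once by address and once by state, and can give different answers for the same input.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Counts distinct items by identity, i.e. by comparing pointers.
inline std::size_t count_by_identity(
        const std::vector<const std::string*>& items) {
    std::set<const std::string*> seen(items.begin(), items.end());
    return seen.size();
}

// Counts distinct items by value, i.e. by comparing state.
inline std::size_t count_by_value(
        const std::vector<const std::string*>& items) {
    std::set<std::string> seen;
    for (const std::string* p : items) {
        seen.insert(*p);
    }
    return seen.size();
}
```

Given two const strings that both hold "123", a list pointing at both counts as two items by identity and one by value, so a machine that uses the first function behaves differently from one that uses the second even though nothing is ever mutated.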

ahhhh

OK.

if no such operation exists? then it is a value type?


<snip>

define the difference of value semantics....I'm not convinced that I
'believe' in them as anything other than a shortcut.

I'm not sure how to formally deal with the word
"identity". Kosikov suggests identity relates back to a formal mapping
from an object o to its identity id(o),

is it 1:1?

That's the unfortunate thing. It must be, but if you can demonstrate
that, you already have a definition of identity, so id() is not
necessary! That's why I don't think this idea helps. In the end,
the domain is a set, and distinct elements of a set are by definition
distinct. You don't need to map distinct things to distinct values in
order to tell you that they were distinct in the first place!

I am mildly at a loss to know what identity is actually for.

The concept of identity is important because many OO programs perform
identity comparisons.

OK....but this is just the definition of a relation.....people seem to
get very hung up about '=='......and I never really understand
why......maybe you have explained why.

It is also important in C++ to determine
whether a class should be an object-type or value-type. For example,
it is generally best for complex numbers, strings and dates to have
value semantics.

it is best because?
it is unnecessary to make it an object type.....

I don't use value types generally.... but I do have an awful lot of
immutable types with '==' defined on value....does that make them
'value types'?


IMO a careful analysis of the meaning of object identity suggests that
OO is inferior to RM for representing knowledge about external entities
that can easily be stored using predicates.


I still think this is only syntax....but I may be wrong.

I accept that RM is more effective at storing
relations....unsurprisingly.
