Re: object system...
- From: "cr88192" <cr88192@xxxxxxxxxxx>
- Date: Fri, 9 Jan 2009 17:13:29 +1000
"Dmitry A. Kazakov" <mailbox@xxxxxxxxxxxxxxxxx> wrote in message
news:1i10n69z8xcdq$.16kkhaz0ituqy$.dlg@xxxxxxxxxxxxx
On Fri, 9 Jan 2009 00:01:09 +1000, cr88192 wrote:
"Dmitry A. Kazakov" <mailbox@xxxxxxxxxxxxxxxxx> wrote in message
news:1svcuf8jnljq$.1g6jq9vst71p2.dlg@xxxxxxxxxxxxx
On Thu, 8 Jan 2009 20:35:22 +1000, cr88192 wrote:
"Dmitry A. Kazakov" <mailbox@xxxxxxxxxxxxxxxxx> wrote in message
news:1iezboh1oejdf.x4om15uvsv7j.dlg@xxxxxxxxxxxxx
On Tue, 6 Jan 2009 01:08:54 +1000, cr88192 wrote:
it is possible to modify class layout at runtime, and the object system
machinery will work to keep everything consistent (internally, a
versioning system is used, so an older object may be laid out according
to an older version of the class), however doing so will reduce the
performance of the class(es) in question;
Do you mean representation in the memory? Why do you want to modify
it?
It does not look like a workable idea when considering packed
containers
etc. For instance an array of Booleans.
an example of a situation in which the layout of a class would be
modified would be, for example, loading a new version of the class from
a file, or
rebuilding an existing module.
New version of a class is another class. But you are already untyped, so
who cares.
not exactly...
the new version keeps the same identity as the original, and
non-structural
state (such as static variables and overloaded methods) is shared between
the old and new versions...
= mutating objects. A bad idea, IMO.
it also creates a bad case of slowness, as it causes slot and method access
to resort to a much slower algo (limited to instances and subclasses of the
modified class), however it can pull off some things which are otherwise
problematic.
say, the user has been working on a piece of code, and then their
changes
have been committed, and these changes happen to include revising the
layout
(such as by adding new slots or methods, ...).
... removing slots, methods etc. Why should old instances work with new
implementations of the new class?
there are reasons for needing to keep class identity.
None in my eyes. Class identifies itself and the types in it.
consider a VM, say, the Java VM, has certain classes, say,
'java.lang.String', which need to exist prior to actually loading the
class
file in which the class is defined.
so, early on the VM, a simple fake version of the class is implemented,
and
once the actual version is loaded, it patches over the original. Now,
suddenly, all of the old instances become fully upgraded to their new
standing as an actual class with actual methods...
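
A minimal sketch of the patch-over idea described above, assuming that methods are looked up through the class object at call time. The names `ClassInfo`, `Instance`, `patch` and `call` are hypothetical and are not taken from the object system or from any JVM; the point is only that the class object keeps its identity while its definition is replaced.

```python
# Hypothetical sketch: a class object that is patched in place, so that
# instances created against the early stub see the real methods later.

class ClassInfo:
    def __init__(self, name):
        self.name = name
        self.methods = {}      # method name -> callable
        self.is_stub = True    # True until the real definition is loaded

    def patch(self, methods):
        """Install the real definition without replacing the class object."""
        self.methods.update(methods)
        self.is_stub = False


class Instance:
    def __init__(self, cls, payload):
        self.cls = cls          # reference to the shared class object
        self.payload = payload

    def call(self, name, *args):
        # Lookup happens through the class at call time, so a later
        # patch is visible to instances that were created earlier.
        return self.cls.methods[name](self, *args)


string_cls = ClassInfo("java.lang.String")   # early stub, no methods yet
early = Instance(string_cls, "hello")        # created before the real class loads

string_cls.patch({"length": lambda self: len(self.payload)})

print(early.call("length"))   # 5
```

The old instance still points at the same class object after the patch, which is what "keeping class identity" amounts to in this sketch.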
which potentially breaks all contracts. The stuff needs to be recompiled,
end of story.
this assumes that a linking takes place between the objects' physical layout
and the code compiled to use it. if this does not happen (for example, slots
are always accessed via a handle and an accessor), then even changing object
layout need not break the client code (I have designed it to work even in
the multithreaded case...).
so, versioning is used such that the different layouts can coexist, while
still sharing everything else (class identity, slot and method handles,
...). externally, it doesn't look like much of anything has changed...
It is not IMO the way to do such thing. A proper way could be to make
types
first class citizen. Then you would have type values and objects (and meta
types of). These objects would have states = versions in your terminology.
Anyway I doubt that types can be versioned independently. I guess that
this
stuff must apply to larger entities like compilation units (packages etc),
rather than individual types.
in terms of the API, classes are first-class objects...
however, a versioned class still retains the same identity (AKA: the same
class pointer, ...).
the given class then "remembers" all of its previous versions (size, memory
layout, ...), and also its newest version, superclass versions, ... version
info is then kept on a per-instance basis, and so conceivably, each instance
of a given class could have a different version and a different physical
layout...
now, as noted, different algos are used in different cases.
namely, if the object has never been modified, then the slots are linked
directly (AKA: the slot directly encodes its physical offset in the
instance);
however, once the layout is changed, then the object switches all its slots
over to "indirect" mode, which then causes accesses to locate info about the
current layout of the current object for the class of the slot in question
(AKA: it walks the inheritance hierarchy checking version numbers, ...), and
then uses a table to figure out the slot's offset in this instance.
so, yes, it is important for performance reasons to avoid class modification
whenever possible.
(however, all of this machinery is hidden behind the API, and so is solely
the property of the object system machinery...).
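
The direct and indirect access modes described above could look roughly like the following sketch. It assumes a single class with no inheritance (so the walk over superclass versions is omitted), and the names `SlotHandle`, `VersionedClass`, `relayout`, `get_slot` and `set_slot` are hypothetical, not the actual API.

```python
# Hypothetical sketch: slot handles that use a stored offset until the
# class layout changes, then fall back to a per-version offset table.

class SlotHandle:
    def __init__(self, cls, name, offset):
        self.cls = cls
        self.name = name
        self.offset = offset   # only trusted while the class is unmodified


class VersionedClass:
    def __init__(self, slot_names):
        # One {slot name: offset} table per layout version.
        self.layouts = [{n: i for i, n in enumerate(slot_names)}]
        self.handles = {n: SlotHandle(self, n, i) for i, n in enumerate(slot_names)}
        self.modified = False

    def relayout(self, slot_names):
        self.layouts.append({n: i for i, n in enumerate(slot_names)})
        for n in slot_names:
            if n not in self.handles:
                self.handles[n] = SlotHandle(self, n, None)
        self.modified = True   # every handle of this class goes indirect

    def new_instance(self):
        return Instance(self, len(self.layouts) - 1)


class Instance:
    def __init__(self, cls, version):
        self.cls = cls
        self.version = version                       # per-instance version info
        self.data = [None] * len(cls.layouts[version])


def offset_of(obj, handle):
    cls = handle.cls
    if not cls.modified:
        return handle.offset                         # direct mode
    return cls.layouts[obj.version][handle.name]     # indirect mode


def get_slot(obj, handle):
    return obj.data[offset_of(obj, handle)]


def set_slot(obj, handle, value):
    obj.data[offset_of(obj, handle)] = value


point = VersionedClass(["x", "y"])
hx = point.handles["x"]

old = point.new_instance()
set_slot(old, hx, 1)                  # direct: offset 0 comes from the handle

point.relayout(["z", "x", "y"])       # "x" moves to offset 1 in version 1
new = point.new_instance()
set_slot(new, hx, 2)                  # indirect: offset comes from the version table

print(get_slot(old, hx), get_slot(new, hx))   # 1 2
```

The same handle works for both instances even though "x" sits at a different offset in each, at the cost of an extra table lookup once the class has been modified.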
The problem exists only because you go untyped. If your system were
typed, there would be no problem, because competing implementations of
the same
interface could coexist.
not sure why this would be...
In a typed system you can separate implementation from the interface
(contracts). So long as the contracts are stable, implementations can vary.
ok, yes.
the use of interfaces is also a way to avoid some issues, since an interface
can bind to many different classes.
attempting to make use of newer slots or similar may cause the object
itself to be re-packaged using the new layout (and, thus, brought up to
date
with its class).
In a typed system an old object would be substitutable in the methods of its
subtype without further actions. Changing representation, if necessary
would happen transparently upon type conversions (forward and backward).
well, at this level, the class is bound physically to the layout.
changing the object class thus involves also a change in structure, and
very
likely, a change in object identity.
the result then is not the old object with a new type, but a new object
with
some of the old state...
Yes.
it is possible to add slots to individual instances of an object,
allowing the emulation of some aspects of Prototype-OO style
What for?
Prototype OO has some capabilities that Class/Instance OO does not...
C/I is also unable to effectively facilitate some non-C/I languages,
such
as JS, ...
Example?
dynamic delegation, ability to use an object the same as a dictionary
absent
needing some separate kind of collection object, ability to dynamically
add
slots and methods to an object, ...
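
To make the three capabilities listed above concrete, here is a small prototype-style sketch. `ProtoObject` and its methods are hypothetical names; this is not how the object system (or any JavaScript engine) implements them.

```python
# Hypothetical sketch of a prototype-style object: it acts as a dictionary,
# accepts new slots and methods at runtime, and delegates failed lookups
# to another object that can be swapped while the program runs.

class ProtoObject:
    def __init__(self, delegate=None):
        self.slots = {}           # the object doubles as a dictionary
        self.delegate = delegate  # may be re-pointed at runtime

    def get(self, name):
        obj = self
        while obj is not None:
            if name in obj.slots:
                return obj.slots[name]
            obj = obj.delegate    # not found here: ask the delegate
        raise KeyError(name)

    def set(self, name, value):
        self.slots[name] = value  # adds a slot or method dynamically

    def send(self, name, *args):
        # The method runs with the original receiver, not the delegate.
        return self.get(name)(self, *args)


animal = ProtoObject()
animal.set("speak", lambda self: self.get("name") + " makes a sound")

dog = ProtoObject(delegate=animal)
dog.set("name", "Rex")
print(dog.send("speak"))    # Rex makes a sound

robot = ProtoObject()
robot.set("speak", lambda self: self.get("name") + " beeps")
dog.delegate = robot        # dynamic delegation: change the target at runtime
print(dog.send("speak"))    # Rex beeps
```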
These are means, but what is the goal?
supporting certain kinds of programming languages and formalisms that would
otherwise go unsupported...
also, it has the ability to implement a language like JavaScript without
introducing a semantic barrier (which is what would happen if
implementing
these features on top of a C/I system, or as a separate object system).
I am not sure that languages like JavaScript should be implemented.
Actually, I am sure they better should not.... (:-))
none the less, JS is probably not going away, and the ability to support it
effectively (and languages like it, such as Self, Ruby, ...) is likely an
asset to a VM framework (after all, both Sun and MS have put money into
adding these kinds of languages to the JVM and .NET...).
(as well as the ability to "clone" objects, ...),
You mean a polymorphic copy? Where is a problem? Except that one must
separate classes and types in order to be able to do so. Does your
system
separate them?
a clone of the object has the same type and state as the original, but
a
separate object identity.
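
The distinction being drawn here can be shown with Python's reference semantics, used purely as an illustration (in a language with value-type assignment, `x := y` would itself produce the copy). The `Point` class is a made-up example.

```python
# Illustration only: in a reference-semantics language, plain assignment
# aliases the object, while a clone has the same type and state but a
# separate identity.
import copy


class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y


a = Point(1, 2)
alias = a               # reference assignment: still the same object
clone = copy.copy(a)    # clone: same type and state, separate identity

clone.x = 10            # changing the clone leaves the original alone

print(alias is a, clone is a)          # True False
print(type(clone) is type(a), a.x)     # True 1
```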
That is called assignment. x := y; BTW, not all types should have
assignment.
it is called cloning in Prototype-OO terminology...
I prefer the old term.
but, this may be because there is a difference in the notion as to what
is
an object (in P-OO, an object is never considered to be a value type,
which
is what is required for assignment to be considered a means of
duplicating
an object).
That's because they fail to separate types and classes. In a language
where
they are separated you can copy a polymorphic object using assignment,
(because there is a distinct type for).
ok.
there is also support for object delegation (like in Self or
JavaScript),
but this feature can at present only be used via interfaces (this is
mostly for technical reasons);
I see, delegation when there are interfaces to members looks
technically
easy. The problem is syntax sugar for.
I am not sure I know what you mean.
How to express in the language a wish to delegate, A of X to B of Y.
I have not specified a language-level manifestation of all this, however,
were I to do so, it would probably be via special keywords...
That's the most interesting part, because an implementation is trivial.
ok.
In my view interface as a separate concept brings only disadvantages.
There should be only types and classes of.
I am not sure I see the reason for this view.
Because it is inconsistent and extremely inefficient to mix them. An
example is C++, which re-dispatches all the time and keeps type tags in
all objects.
C++ does not have interfaces, so I don't see the problem...
C++ does not need interfaces because it has abstract types, which is a
more
general concept than interface. The problem is in inconsistency and
inefficiency.
then again, in C++, MI is often used to similar effect as having
interfaces...
The inverse. Interface is a poor-man's MI.
it depends on how one looks at it, interfaces, as they exist in the JVM and
CLI, are a rather different concept than MI, with various strengths and
weaknesses, ...
for example, there are things that can be done with one and not the other,
and also the reverse...
however, interfaces do avoid some of the complexities of MI.
Huh, interfaces have all [alleged] problems of MI.
not quite...
since an interface doesn't actually implement anything, there is no real
need to worry about clashes or lookup order (since 2 methods which match
will simply merge). as a result, one major class of problems that exists in
MI (having to identify which of the superclasses a given method came from,
...) simply disappears.
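
A small sketch of the difference being claimed, using Python abstract base classes as a stand-in for JVM/CLI interfaces (an approximation, since Python has no separate interface construct). All class names are made up for the example.

```python
# Two "interfaces" that declare the same method merge without conflict,
# because neither supplies an implementation. Two implementing superclasses
# force a lookup order to decide which body wins.
from abc import ABC, abstractmethod


class Closeable(ABC):
    @abstractmethod
    def close(self): ...


class Resource(ABC):
    @abstractmethod
    def close(self): ...


class File(Closeable, Resource):
    def close(self):            # one body satisfies both declarations
        return "closed"


class A:
    def close(self):
        return "A.close"


class B:
    def close(self):
        return "B.close"


class C(A, B):                  # MI with implementations: order matters
    pass


print(File().close())           # closed
print(C().close())              # A.close (A precedes B in the lookup order)
```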
performance could be better, but it is still "acceptable"...
What I expect from a good system is zero performance hit when no
dispatch
happens. This is why type/class separation is so important to me. If
you
deal with only types, which is about 80-90% of all calls, you do not
need
to dispatch and so there is no overhead (the target is statically
known).
Another requirement is that dispatch never fails, i.e. classes are
statically checkable.
Yet another is hard real-time constraints on call overhead.
if you mean built-in value types, yes, I have these...
No.
actually, given how everything exists, they are essentially unavoidable
anyways...
I wish same performance for all kind of types. It is unacceptable when
user-defined types inflict some performance loss. There is something
conceptually wrong with the types system then.
errm, these issues are usually intrinsic.
user defined types are typically slower, if anything, because they are not
natively implemented on the processor.
So are the predefined types of higher-level language. I don't count broken
assembler named C.
C is often, and has been so traditionally, regarded as an HLL...
there is actually a good deal of work that goes between the C parser and
producing the ASM output...
so, for example, 128 bit integers will be slower than 32 bit integers,
and
within reason, this much is unlikely to change...
No, they are as slow as your best possible implementation based on 32-bit
instructions available. This is my requirement to the type system. I don't
require to rewire the CPU.
ok.
well, yes, in my compilers and such, 128 bit ints and similar are built in
types...
many operations should be fast, but beware of multiplication and especially
division (128 bit division is not exactly what we would call... fast... so,
be warned, as internally the operation is implemented as a loop which
performs something called long division...).
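
For reference, a loop-based long division of the kind alluded to above can be sketched as a bit-at-a-time shift-and-subtract. Whether the actual implementation works per bit or per machine word is not stated, so the per-bit form and the name `udivmod128` are assumptions.

```python
# Hypothetical sketch: unsigned 128-bit division as binary long division.
# One iteration per bit, which is why it is slow next to a native divide.

MASK128 = (1 << 128) - 1


def udivmod128(n, d):
    n &= MASK128
    d &= MASK128
    if d == 0:
        raise ZeroDivisionError("division by zero")
    q = 0
    r = 0
    for i in range(127, -1, -1):
        r = (r << 1) | ((n >> i) & 1)   # bring down the next bit of n
        if r >= d:                      # divisor fits: subtract, set quotient bit
            r -= d
            q |= 1 << i
    return q, r


print(udivmod128(1000, 7))   # (142, 6)
```

The loop always runs 128 times regardless of the operands, which gives a fixed (if large) cost per division.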
be more warned though about the 128 bit floats...
note that a major central structure in the object system is a few
large
hash tables, which are the primary means by which things like
interface
lookups, ... are performed (hashes are computed using the classname,
slot/method name, and method signature, ...).
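
As a rough illustration of such a lookup, the sketch below keys one table on class name, method name and signature. The JVM-style signature string, the `Circle` class and the helper names are assumptions made for the example, and a Python dict stands in for the hash table.

```python
# Hypothetical sketch: an interface-style call resolved through one global
# table keyed on (class name, method name, signature).

method_table = {}


def register(class_name, method_name, signature, func):
    method_table[(class_name, method_name, signature)] = func


def interface_call(obj, method_name, signature, *args):
    # The key is hashed as a whole; the receiver's class picks the entry.
    func = method_table[(type(obj).__name__, method_name, signature)]
    return func(obj, *args)


class Circle:
    def __init__(self, r):
        self.r = r


register("Circle", "scale", "(I)I", lambda self, k: self.r * k)

print(interface_call(Circle(2), "scale", "(I)I", 5))   # 10
```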
Performance of hash tables is less predictable, you cannot actually
tell
how long it would take to call a method if that goes through a hash
table. But in the first place I don't see why it should. IMO this
stuff should
not leak beyond the compile time.
actually, the whole system exists at runtime, because the entire
framework makes heavy use of dynamic compilation and JIT for
everything...
Is this existence required? Compare, it is required presence of the
programmer, while the program is running... You cannot store the
programmer on the flash, it has to be paid, fed, needs some time to
sleep.
well, otherwise I need to resort to other means to accomplish the same
tasks...
hash tables are a good and generally reliable mechanism IME, and are much
preferable to other options (such as linear or binary searches, ...).
My point is that I prefer linear search at compile time to a hash table at
run-time.
in my case, both compiletime and runtime are sort of fused together...
even then, some operations will invariably require either a lookup or
hash-table based dispatch at runtime (I just chose the one I like more of
the two...)
but, anyways, the performance of the hash-tables is strictly bounded:
if any lookup takes more than a certain number of steps, then the whole
table is expanded and all of the items are re-hashed...
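
The policy described above might be sketched as an open-addressing table with a probe limit. Linear probing, the doubling growth factor and the name `BoundedProbeTable` are assumptions; the sketch has no deletion, and it would grow without end if more than `max_probes` keys shared one hash value.

```python
# Hypothetical sketch: lookups never take more than max_probes steps;
# an insert that cannot find a slot within the limit grows the table
# and re-hashes every item.

class BoundedProbeTable:
    def __init__(self, size=8, max_probes=4):
        self.size = size
        self.max_probes = max_probes
        self.slots = [None] * size
        self.rehash_count = 0

    def _probe(self, key):
        start = hash(key) % self.size
        for step in range(self.max_probes):
            yield (start + step) % self.size

    def get(self, key):
        for i in self._probe(key):          # at most max_probes steps
            entry = self.slots[i]
            if entry is None:
                break
            if entry[0] == key:
                return entry[1]
        raise KeyError(key)

    def put(self, key, value):
        while True:
            for i in self._probe(key):
                entry = self.slots[i]
                if entry is None or entry[0] == key:
                    self.slots[i] = (key, value)
                    return
            self._grow()                    # probe limit exceeded: expand

    def _grow(self):
        old = [e for e in self.slots if e is not None]
        self.size *= 2
        self.slots = [None] * self.size
        self.rehash_count += 1
        for k, v in old:                    # re-insert every item
            self.put(k, v)


table = BoundedProbeTable()
for i in range(50):
    table.put(("SomeClass", "method%d" % i, "()V"), i)

print(table.get(("SomeClass", "method7", "()V")))   # 7
```

Lookups stay within the probe limit, but an insert that triggers growth pays for re-hashing everything, so the worst-case cost of an insert is not bounded in the same way.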
= unbounded (re-hashing requires time, a lot of time)
but, it is rare enough to be safely ignored (and my initial hash tables
are
fairly large...). more so, usually rehashing is a fairly quick operation
(not something you would want running all the time, but not really
expensive
enough to be that much worth worrying about, given its relative
rarity...).
This is still unbounded. Technically you are talking about the mean time.
it is fast enough, since probably only in very rare cases, if ever, will
this actually need to happen... (one is far more likely to run into the
garbage collector than a rehash, and the GC is likely to be far more
expensive, as it may grind through 10s or 100s of MB of data, rather than
re-inserting maybe 256k or 1M items into a new hash table...).
anyways, the hash table should only really matter anyways during interface
calls (or when looking up methods by name, such as in a name-based
dispatch), as plain virtual method calls have no real need to mess with it
(they can find the correct method on their own...).
--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de