Re: object system...
- From: "cr88192" <cr88192@xxxxxxxxxxx>
- Date: Fri, 9 Jan 2009 00:01:09 +1000
"Dmitry A. Kazakov" <mailbox@xxxxxxxxxxxxxxxxx> wrote in message
news:1svcuf8jnljq$.1g6jq9vst71p2.dlg@xxxxxxxxxxxxx
On Thu, 8 Jan 2009 20:35:22 +1000, cr88192 wrote:
"Dmitry A. Kazakov" <mailbox@xxxxxxxxxxxxxxxxx> wrote in message
news:1iezboh1oejdf.x4om15uvsv7j.dlg@xxxxxxxxxxxxx
On Tue, 6 Jan 2009 01:08:54 +1000, cr88192 wrote:
but, oh well, I figure I will present some of the ideas behind an object
system I had implemented and am making use of in my projects, and maybe
people can comment if some things are good or not so good.
basically, the core of the system is this:
Single-Inheritance Class/Instance OO (MI is more-or-less supported as
well, but it is hacky and not well supported by the APIs, so I mostly
ignore MI); there are Interfaces, like in Java and friends;
...
but, it differs from the normal model some in a few ways:
interfaces can implement slots as well as methods;
Slot = member? Implemented as a getter/setter pair?
yes. no.
the object system can access slots/members/fields directly from
interfaces as if they were normal instance slots...
crap, there are a lot of possible words for this simple idea:
slot;
field;
member;
variable;
...
May I suggest just: "abstract record interface." You have a type with the
operation ".". The operation can be implemented by the user. That's it.
modifiable operators like this don't exist in most languages, and would
present a hassle for the compiler (but, if I am focusing more at the layer
of the VM, I need not care as much over whether the HLL implements operator
overloading or not...).
it is technically possible to use any interface with any compatible
class (due to a side effect of how the interface mechanism is implemented,
the same interface will work on a class regardless of whether or not said
class implements the interface as such);
Egh? Does it mean that there is no manifested declaration that a type
implements the interface, so that anyone can call anything on everyone?
Looks like a mess.
actually, more like, the object system does not bother to check that it
does so, and so everything that is not prevented (and will not cause
something to break if done) is, by implication, allowed...
= untyped
at this level, yes...
the notion of interfaces binding to classes is still kept as a kind of
vestige though, and other code can check for this and enforce this as
needed...
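
A minimal sketch of the unchecked-versus-checked binding described above. All names here (`Interface`, `declaredInterfaces`, `HasXY`) are hypothetical and are not taken from the actual object system; the point is only that the structural path works on any compatible class, while a stricter layer can still consult the declared binding.

```javascript
// Hypothetical names throughout; this is not the API of the system discussed.
class Interface {
  constructor(name, members) {
    this.name = name;
    this.members = members; // slot and method names the interface exposes
  }

  // Structural test: the object merely has to provide every member.
  isCompatible(obj) {
    return this.members.every((m) => m in obj);
  }

  // Nominal test: the class must have declared the interface (the "vestige").
  isDeclaredBy(obj) {
    const declared = obj.constructor.declaredInterfaces || [];
    return declared.includes(this.name);
  }

  // Unchecked access: anything not prevented is allowed.
  get(obj, member) {
    return obj[member];
  }

  // Checked access, as a stricter layer (say, a VM spec) might demand.
  getChecked(obj, member) {
    if (!this.isDeclaredBy(obj)) {
      throw new TypeError(`${obj.constructor.name} does not implement ${this.name}`);
    }
    return obj[member];
  }
}

class Vec {
  static declaredInterfaces = ["HasXY"];
  constructor(x, y) { this.x = x; this.y = y; }
}

class Pixel { // never declares HasXY, but is layout-compatible
  constructor(x, y) { this.x = x; this.y = y; }
}

const hasXY = new Interface("HasXY", ["x", "y"]);
const p = new Pixel(3, 4);
console.log(hasXY.isCompatible(p)); // structural check
console.log(hasXY.get(p, "x"));     // unchecked access
```

Here `get` succeeds on `Pixel` even though it never declared the interface, while `getChecked` would reject it; which of the two a front end uses is a policy decision above the object system.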
so, it is a tradeoff:
as far as actual coding practice goes, doing something like this is
rather ugly; but as far as using it for the implementation of other
languages, or for using classes and interfaces outside of what they are
normally used for, these kinds of things can be rather useful...
I don't think so. The question is whether there needs to be a layer between
the hardware (untyped) and the higher-level programming language (typed).
I don't feel it necessary.
potentially, but all this is operating below the level of the VM, and what
will probably be done is what the VM spec says needs to be done.
so, for example, if the VM spec says this throws an exception, then it can
be checked for and an exception is thrown...
it is possible to modify class layout at runtime, and the object system
machinery will work to keep everything consistent (internally, a
versioning system is used, so an older object may be laid out according
to an older version of the class), however doing so will reduce the
performance of the class(es) in question;
Do you mean representation in the memory? Why do you want to modify it?
It does not look like a workable idea when considering packed containers
etc. For instance an array of Booleans.
an example of a situation in which the layout of a class would be
modified would be, for example, loading a new version of the class from a
file, or rebuilding an existing module.
New version of a class is another class. But you are already untyped, so
who cares.
not exactly...
the new version keeps the same identity as the original, and non-structural
state (such as static variables and overloaded methods) is shared between
the old and new versions...
say, the user has been working on a piece of code, and then their changes
have been committed, and these changes happen to include revising the
layout (such as by adding new slots or methods, ...).
... removing slots, methods etc. Why should old instances work with new
implementations of a new class?
there are reasons for needing to keep class identity.
consider a VM, say, the Java VM, has certain classes, say,
'java.lang.String', which need to exist prior to actually loading the class
file in which the class is defined.
so, early on in the VM, a simple fake version of the class is implemented,
and once the actual version is loaded, it patches over the original. Now,
suddenly, all of the old instances become fully upgraded to their new
standing as an actual class with actual methods...
so, versioning is used such that the different layouts can coexist, while
still sharing everything else (class identity, slot and method handles,
....). externally, it doesn't look like much of anything has changed...
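
A rough sketch of how such versioning could work, under stated assumptions: the names (`VersionedClass`, `redefine`, `getSlot`, `upgrade`) are invented for illustration, layouts are modelled as arrays of slot names, and an instance is repacked only when it is asked for a slot its own layout lacks. The real system is C and assembler and will differ in every detail.

```javascript
// Hypothetical sketch; names such as VersionedClass are illustrative only.
class VersionedClass {
  constructor(name, slots) {
    this.name = name;       // class identity survives layout changes
    this.statics = {};      // non-structural state, shared by all versions
    this.layouts = [slots]; // layouts[i] = slot names for version i
  }

  get current() { return this.layouts.length - 1; }

  // Installing a new layout does not touch existing instances.
  redefine(slots) { this.layouts.push(slots); }

  instantiate(values) {
    return { cls: this, version: this.current, data: [...values] };
  }
}

// Repack an instance into the newest layout, carrying over slots by name.
function upgrade(obj) {
  const oldLayout = obj.cls.layouts[obj.version];
  const newLayout = obj.cls.layouts[obj.cls.current];
  obj.data = newLayout.map((slot) => {
    const i = oldLayout.indexOf(slot);
    return i >= 0 ? obj.data[i] : undefined;
  });
  obj.version = obj.cls.current;
}

// Slot reads use the object's own layout; a miss triggers the upgrade.
function getSlot(obj, slot) {
  let i = obj.cls.layouts[obj.version].indexOf(slot);
  if (i < 0 && obj.version !== obj.cls.current) {
    upgrade(obj);
    i = obj.cls.layouts[obj.version].indexOf(slot);
  }
  if (i < 0) throw new ReferenceError(`no slot ${slot} in ${obj.cls.name}`);
  return obj.data[i];
}

const str = new VersionedClass("String", ["chars"]); // early stub version
const s = str.instantiate(["abc"]);
str.redefine(["length", "chars"]);                   // real definition arrives
console.log(getSlot(s, "chars"), s.version);         // old layout still serves this read
console.log(getSlot(s, "length"), s.version);        // this read forces a repack
```

The class object, its statics, and the slot names stay the same across `redefine`; only the per-instance layout index changes, which is why nothing looks different from outside.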
The problem exists only because you go untyped. If your system were typed,
there would be no problem, because competing implementations of the same
interface could coexist.
not sure why this would be...
but, it is also the case that, at the level of the implementation, the
entire object system boils down to a big mass of C code and periodic chunks
of assembler...
attempting to make use of newer slots or similar may cause the object
itself to be re-packaged using the new layout (and, thus, brought up to
date with its class).
In a typed system an old object would be substitutable in the methods of its
subtype without further actions. Changing representation, if necessary,
would happen transparently upon type conversions (forward and backward).
well, at this level, the class is bound physically to the layout.
changing the object class thus involves also a change in structure, and very
likely, a change in object identity.
the result then is not the old object with a new type, but a new object with
some of the old state...
it is possible to add slots to individual instances of an object,
allowing the emulation of some aspects of Prototype-OO style
What for?
Prototype OO has some capabilities that Class/Instance OO does not...
C/I is also unable to effectively facilitate some non-C/I languages, such
as JS, ...
Example?
dynamic delegation, ability to use an object the same as a dictionary absent
needing some separate kind of collection object, ability to dynamically add
slots and methods to an object, ...
also, it has the ability to implement a language like JavaScript without
introducing a semantic barrier (which is what would happen if implementing
these features on top of a C/I system, or as a separate object system).
for example:
var obj={x: 3, y: 4};
obj.len=function() { return(this.x*this.x + this.y*this.y); };
var z=Math.sqrt(obj.len());
var obj1={x: 5, y: 6};
obj1.__proto__=obj;  // delegation link; in plain JS the parent slot is __proto__
var z1=Math.sqrt(obj1.len());
z is 5.
z1 is 7.81...
now, to implement the above with a pure C/I approach would end up creating
something fundamentally different than an object containing the slots in
question.
now, one can argue that still, since it can be built on top of, it is not
beyond the abilities of, C/I OO. fair enough, only that, by this definition,
I can also call C and ASM OO on the grounds that I can implement an object
system in them given sufficient effort...
(as well as the ability to "clone" objects, ...),
You mean a polymorphic copy? Where is the problem? Except that one must
separate classes and types in order to be able to do so. Does your
system separate them?
a clone of the object has the same type and state as the original, but a
separate object identity.
That is called assignment. x := y; BTW, not all types should have
assignment.
it is called cloning in Prototype-OO terminology...
but, this may be because there is a difference in the notion as to what is
an object (in P-OO, an object is never considered to be a value type, which
is what is required for assignment to be considered a means of duplicating
an object).
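
To make the distinction concrete in JavaScript terms, here is a small sketch; `clone` is a helper written for this illustration (a shallow copy), not a function from the system under discussion. Assignment copies a reference, so both names share one identity, whereas a clone has the same prototype and slot values but is a separate object.

```javascript
// Shallow clone: same prototype (type) and same slot values, new identity.
function clone(obj) {
  return Object.assign(Object.create(Object.getPrototypeOf(obj)), obj);
}

const a = { x: 3, y: 4 };
const alias = a;      // assignment: two names, one object
const copy = clone(a); // clone: equal state, distinct object

copy.x = 10;           // does not affect a
alias.y = 20;          // does affect a
console.log(a, copy);
```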
there is also support for object delegation (like in Self or JavaScript),
but this feature can at present only be used via interfaces (this is
mostly for technical reasons);
I see, delegation when there are interfaces to members looks technically
easy. The problem is the syntax sugar for it.
I am not sure I know what you mean.
How to express in the language a wish to delegate, A of X to B of Y.
I have not specified a language-level manifestation of all this, however,
were I to do so, it would probably be via special keywords...
note that a delegate link would be sort of like class inheritance, but done
between 2 live objects.
in my implementation, the objects can't normally directly see each other
(from the POV of each object, the delegation link looks just like another
slot).
so, how delegation is actually used in my implementation is through an
interface, which as-needed can also follow delegation links.
now, the implications of this are subtle:
very possibly, a JS implementation would not access objects directly, but
would instead create bunches of interfaces containing the slots and methods
in question, and then interact with objects primarily via these interfaces.
AFAIK, this is similar to how some of the existing languages targeting the
JVM, such as JRuby, operate...
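
One way to picture interface-driven delegation, as a sketch only: the delegation link is an ordinary slot (called `parent` here by assumption), the objects themselves know nothing about lookup, and the interface does the chain walking. `lookup`, `makeInterface`, and `MAX_DEPTH` are names invented for this example.

```javascript
// Hypothetical: the delegation link is an ordinary slot named "parent".
const MAX_DEPTH = 32; // guard against cycles in the delegation chain

// Find a slot on the object itself or on anything it delegates to.
function lookup(obj, name) {
  let cur = obj;
  for (let depth = 0; cur != null && depth < MAX_DEPTH; depth++) {
    if (Object.prototype.hasOwnProperty.call(cur, name)) return cur[name];
    cur = cur.parent;
  }
  return undefined;
}

// An interface bundles slot names and resolves them via lookup().
function makeInterface(names) {
  const iface = {};
  for (const name of names) {
    iface[name] = (obj, ...args) => {
      const v = lookup(obj, name);
      // Methods run with the original receiver, as in Self or JavaScript.
      return typeof v === "function" ? v.apply(obj, args) : v;
    };
  }
  return iface;
}

const base = { x: 3, y: 4, len() { return this.x * this.x + this.y * this.y; } };
const derived = { x: 5, y: 6, parent: base };
const shape = makeInterface(["x", "len"]);

console.log(Math.sqrt(shape.len(base)));    // 5
console.log(Math.sqrt(shape.len(derived))); // sqrt(61), about 7.81
```

A JS front end built this way would generate one such interface per set of names it needs and never touch object internals directly.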
there is also support for special assignable methods, which exist in the
instance rather than in the class (the method is actually directed
through a slot rather than through the vtable).
Shudder. Methods do not exist in either. Vtable is a poor design. What
about a sane multiple dispatch instead of the mess.
multiple dispatch could be built on top of the system, however, single
dispatch and vtables are used as these are the highest-performance
options (handling multiple dispatch dynamically can be expensive...).
No, MD cannot be implemented on top of SD. Consequently vtable cannot
serve as a dispatching table of a MD method. vtable implements mapping
type tag -> method. In MD you have a Cartesian product of type tags of the
arguments, i.e. tuple -> method.
not sure why not, but I would probably have to look into it...
worst case, MD could probably be added as a tweaked out method handle, which
when invoked looks up the correct method for the passed signature (could
probably be optimized via a hash table or similar...).
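
The "tweaked out method handle" idea might look roughly like the following. This is a sketch under assumptions: `makeMultimethod`, `define`, and `typeTag` are made-up names, the type tag is simply the constructor name, and matching is exact (no search over superclasses), so it shows the tuple -> method table rather than a full multiple-dispatch resolver.

```javascript
// Hypothetical sketch: a method handle that dispatches on all arguments.
function typeTag(v) {
  return v === null || v === undefined ? String(v) : v.constructor.name;
}

function makeMultimethod() {
  const table = new Map(); // "TagA,TagB" -> implementation

  // The handle itself: build the signature key, then one hash lookup.
  const dispatch = (...args) => {
    const key = args.map(typeTag).join(",");
    const impl = table.get(key);
    if (!impl) throw new TypeError(`no method for (${key})`);
    return impl(...args);
  };

  // Register an implementation for an exact tuple of type tags.
  dispatch.define = (tags, impl) => {
    table.set(tags.join(","), impl);
    return dispatch;
  };
  return dispatch;
}

class Circle {}
class Square {}

const collide = makeMultimethod()
  .define(["Circle", "Circle"], () => "circle/circle")
  .define(["Circle", "Square"], () => "circle/square")
  .define(["Square", "Circle"], () => "square/circle");

console.log(collide(new Circle(), new Square()));
```

The cost per call is one key construction plus one table lookup, which is the expense being weighed against a plain vtable index.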
In my view interface as a separate concept brings only disadvantages.
There should be only types and classes of.
I am not sure I see the reason for this view.
Because it is inconsistent and extremely inefficient to mix them. An
example is C++, which re-dispatches all the time and keeps type tags in
all objects.
C++ does not have interfaces, so I don't see the problem...
then again, in C++, MI is often used to similar effect as having
interfaces...
however, interfaces do avoid some of the complexities of MI.
I did check before, but with my model it is possible to make use of many
of these features while still being within the realm of what could be
statically proven to be correct (although keeping things statically
provable would place many other restrictions on everything as well, for
example, placing restrictions on which objects can be assigned to which
interfaces, either by requiring the class to 'implement' the interface,
or by validating that the class is compatible with the interface, ...).
It is not clear what features you mean. Many of them are implementation
issues.
my whole object system is an implementation issue, but I was describing
some of the impacts of this implementation...
Impacts on the design? That looks like a cart before the horse.
an object system is a means to an end, not the end in itself...
it matters that it can get things done, and get things done efficiently, but
the exact semantics are not so important except insofar as they help or
hinder its utility in terms of getting things done...
so, then, the semantics are a reflection of what I most want to get done,
and what can be most effectively done within the confines of the
implementation... (a change in implementation would I think naturally imply
a change in the semantics of the system).
so, then, the semantics and/or design of something are variable based on the
context in which it is considered.
for example, in one context a screwdriver may be interpreted as a
screwdriver, and in another as a pry-bar or chisel... we may still call it a
screwdriver, but it takes on the same role (or, if you will, semantics) as a
pry or chisel, by the context in which it is applied (say, wedged between 2
pieces of metal and being hit with a hammer...).
I implemented a Class/Instance system then, because this is what most VM's
implement, and also because a C/I system is most capable of delivering the
best performance (Prototype systems, OTOH, tend to be far worse in terms of
performance concerns...).
For static analysis, it is important to have a manifestly typed system +
fine grained types. If you throw everything into one type-class there is
little one could check in the end.
not sure I follow.
Static analysis is based on the information provided by the user. The more
declarative information you provide, the better the analysis that becomes
possible.
In particular, static analysis of dispatch target becomes possible if it
is known whether the object is polymorphic or not. If these are of two
different types, e.g. class (polymorphic) and specific type
(non-polymorphic), then you can always statically decide on dispatch.
ok, yes, ok...
yes, my system generally takes types into account, but often gets a little
lazy WRT object types (where the processing machinery will often get lazy
and look no further than "object", except in those cases where the specific
type of object becomes relevant, such as to know the type of a slot or to
get info about a method or similar...).
performance could be better, but it is still "acceptable"...
What I expect from a good system is zero performance hit when no dispatch
happens. This is why type/class separation is so important to me. If you
deal with only types, which is about 80-90% of all calls, you do not need
to dispatch and so there is no overhead at all (the target is statically
known).
Another requirement is that dispatch never fails, i.e. classes are
statically checkable.
Yet another is hard real-time constraints on call overhead.
if you mean built-in value types, yes, I have these...
No.
actually, given how everything exists, they are essentially unavoidable
anyways...
I want the same performance for all kinds of types. It is unacceptable when
user-defined types inflict some performance loss. There is something
conceptually wrong with the type system then.
errm, these issues are usually intrinsic.
user defined types are typically slower, if anything, because they are not
natively implemented on the processor.
so, for example, 128 bit integers will be slower than 32 bit integers, and
within reason, this much is unlikely to change...
note that a major central structure in the object system is a few large
hash tables, which are the primary means by which things like interface
lookups, ... are performed (hashes are computed using the classname,
slot/method name, and method signature, ...).
Performance of hash tables is less predictable, you cannot actually tell
how long it would take to call a method if that goes through a hash
table. But in the first place I don't see why it should. IMO this stuff
should not leak beyond the compile time.
actually, the whole system exists at runtime, because the entire
framework makes heavy use of dynamic compilation and JIT for everything...
Is this existence required? Compare: it is like requiring the presence of
the programmer while the program is running... You cannot store the
programmer on the flash, he has to be paid, fed, and needs some time to
sleep.
well, otherwise I need to resort to other means to accomplish the same
tasks...
hash tables are a good and generally reliable mechanism IME, and are much
preferable to other options (such as linear or binary searches, ...).
but, anyways, the performance of the hash-tables is strictly bounded:
if any lookup takes more than a certain number of steps, then the whole
table is expanded and all of the items are re-hashed...
= unbounded (re-hashing requires time, a lot of time)
but, it is rare enough to be safely ignored (and my initial hash tables are
fairly large...). more so, usually rehashing is a fairly quick operation
(not something you would want running all the time, but not really expensive
enough to be that much worth worrying about, given its relative rarity...).
if it really became a problem, the process could be offloaded to a separate
thread sort of like the garbage collector...
but, yeah, my current limit is 16 steps...
if any more than 16 steps is needed, then a resize and rehash is invoked...
luckily, all of the objects hashed have precomputed hash values, so it is
unnecessary to recalculate the hashes for the names.
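
A sketch of a step-bounded table along these lines. Only the 16-step limit, the grow-and-rehash response, and the reuse of stored hash values come from the description above; linear probing, doubling, the FNV-1a-style hash, and all names (`BoundedTable`, `probe`, `place`, `grow`) are assumptions made for the example.

```javascript
// Hypothetical sketch; only the 16-step limit comes from the post.
const MAX_STEPS = 16;

// FNV-1a-style string hash (an assumed choice), computed once per stored key.
function hashString(s) {
  let h = 0x811c9dc5;
  for (let i = 0; i < s.length; i++) {
    h ^= s.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

class BoundedTable {
  constructor(size = 64) { // size must be a power of two
    this.slots = new Array(size).fill(null);
  }

  // Linear probing, giving up after MAX_STEPS; returns a slot index or -1.
  probe(slots, hash, key) {
    const mask = slots.length - 1;
    for (let step = 0; step < MAX_STEPS; step++) {
      const i = (hash + step) & mask;
      if (slots[i] === null || slots[i].key === key) return i;
    }
    return -1;
  }

  // Lookups never take more than MAX_STEPS probes.
  get(key) {
    const i = this.probe(this.slots, hashString(key), key);
    return i >= 0 && this.slots[i] !== null ? this.slots[i].value : undefined;
  }

  set(key, value) {
    const entry = { key, hash: hashString(key), value };
    while (!this.place(this.slots, entry)) this.grow();
  }

  place(slots, entry) {
    const i = this.probe(slots, entry.hash, entry.key);
    if (i < 0) return false;
    slots[i] = entry;
    return true;
  }

  // Double the table and re-insert using the stored hashes (names are not re-hashed).
  grow() {
    let size = this.slots.length;
    for (;;) {
      size *= 2;
      const bigger = new Array(size).fill(null);
      const ok = this.slots.every((e) => e === null || this.place(bigger, e));
      if (ok) { this.slots = bigger; return; }
    }
  }
}

const table = new BoundedTable(8);
for (let i = 0; i < 1000; i++) table.set("Class/slot" + i, i);
console.log(table.get("Class/slot42"), table.slots.length);
```

Lookups are bounded by `MAX_STEPS`; the unbounded part is `grow`, which is the cost the objection above is pointing at.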
I actually simulated the hash tables externally, and fine-tuned the algos to
minimize collisions, ... and, usually, with a fairly large number of keys (I
actually had to resort to using recombination of the keys just so that I
would have bunches more with which to run the tests...).
at the time, performance seemed plenty acceptable...
--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de