Re: Confused by behavior of LispWorks for code snippet



Damien Kick <dkixk@xxxxxxxxxxxxx> writes:

With LispWorks Personal Edition 5.0.1 (PowerPC), I see the following
behavior:

<code>
CL-USER 1 > (defparameter *x* #'(lambda () 13))
*X*

CL-USER 2 > *x*
#<anonymous interpreted function 200EB012>

CL-USER 3 > (setf (symbol-function 'f) *x*)
#<anonymous interpreted function 200EB012>

CL-USER 4 > (setf (symbol-function 'g) *x*)
#<anonymous interpreted function 200EB012>

CL-USER 5 > (eq #'f #'g)
NIL

CL-USER 6 > (eql #'f #'g)
NIL

CL-USER 7 > (eql #'f *x*)
NIL

CL-USER 8 > (eq *x* *x*)
T
</code>

I would've expected (EQ #'F #'G) => T. I am getting that result,
i.e. (EQ #'F #'G) => T, with both Allegro CL 8.0 and Clisp. Is
LispWorks mistaken or is that an allowable result?

I don't think the FUNCTION special form is required not to cons, and
hence I'm not sure object identity across a store/retrieve is
guaranteed. I'm quite sure this was questioned, and whatever the
outcome, I know it's controversial. I'm just saying I think it was
the status quo from CLTL that it might cons (perhaps only because I'm
pretty sure it was not specified there), and I don't recall X3J13
fixing that. (If I'm not remembering some change we finally made,
someone please cite it.)
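
One way to see what a particular implementation does is to probe it directly. What follows is only a minimal sketch: the helper names are hypothetical, and the first function may legitimately return different answers in different implementations (and possibly for interpreted versus compiled functions), so no particular result is assumed.

```lisp
;; Hypothetical helper: reports whether storing a function object in a
;; symbol's function cell and reading it back yields the same (EQ)
;; object in the implementation running this code.
(defun function-cell-preserves-identity-p ()
  (let ((fn (lambda () 13))
        (name (gensym "PROBE")))      ; fresh symbol, so nothing is clobbered
    (setf (symbol-function name) fn)
    ;; Normalize the generalized boolean returned by EQ to T or NIL.
    (if (eq (symbol-function name) fn) t nil)))

;; Whatever the answer above, the stored function is still callable
;; through the name.
(defun probe-call ()
  (let ((name (gensym "PROBE")))
    (setf (symbol-function name) (lambda () 13))
    (funcall name)))
```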

But, as I recall, the reasons why a rational implementor might not have
wanted to guarantee the behavior you expected above include...

The function space needs to be quickly callable more than it needs to
be something you can quickly grab a value out of. So some
implementations may have wanted to store something there that was
unwrapped and needed to be wrapped on access. (Normally you'd expect
they could use a tricky offset scheme to hide that, but that's really
quite dependent on the hardware architecture and addressing modes for
the processor in question--recall that CL is not written for a
specific architecture.)

It's probably fair to say that most functions are never accessed by #'
so the space it takes to represent a fully wrapped version of every system
function may have been considered something to be omitted. Lisp used to
get hammered by other languages for how BIG it was, and forcing a space
use to hold a full wrapper may have been tough. The function cell may be
unboxed, and boxing it may cons. You could imagine the process of setting
symbol value to be an unboxing step, storing unboxed data in the symbol (or
a table indexed by it) because a second layer of boxing wasn't needed.
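
The unwrap-on-store, wrap-on-access idea can be illustrated with a toy model. This is a sketch under loud assumptions: every name in it (TOY-FUNCTION, *TOY-CELLS*, and so on) is made up for illustration, and it is not a description of how LispWorks or any other implementation actually represents function cells.

```lisp
;; Toy model only: none of these names correspond to a real implementation.
(defstruct toy-function
  code)                               ; stands in for the raw entry point

(defvar *toy-cells* (make-hash-table :test #'eq)
  "Maps a symbol to the unboxed CODE stored in its toy function cell.")

(defun toy-set-function (name boxed)
  "Store only the unboxed code; the wrapper itself is not kept."
  (setf (gethash name *toy-cells*) (toy-function-code boxed))
  boxed)

(defun toy-get-function (name)
  "Re-box on every access, which allocates a fresh wrapper."
  (make-toy-function :code (gethash name *toy-cells*)))

(defun toy-call (name &rest args)
  "Calling needs no wrapper at all: go straight to the stored code."
  (apply (gethash name *toy-cells*) args))
```

In this model, after (TOY-SET-FUNCTION 'F boxed), calling through the cell stays cheap, but each TOY-GET-FUNCTION returns a wrapper that is not EQ to the one that was stored, even though the CODE slots are EQ.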

Also, if I recall correctly, there was a specific problem with the transition
from CLTL to ANSI CL regarding the fact that (LAMBDA (X) X), the list, used
to be a function, and no longer was to be a function. Clearly, even when it
WAS a function, this was problematic since you can't [easily] make the list
datatype executable, so putting it in the function cell was a barrier to
speed. But this meant that in CLTL if you did
(SETF (SYMBOL-FUNCTION 'F) '(LAMBDA (X) X))
you had a serious problem. You had to either wrap it with an object that
allowed it to be jumped to directly or coerce it to a fast object that replaced
it. In the former case, extraction as a value even when there was no such
object was slowed down because it wasn't a simple move instruction but now was
something that had to sniff at the value there and decide whether to
provisionally unwrap it, returning the list (LAMBDA (X) X) again... or else
it had to be allowed to return #<FUNCTION (LAMBDA (X) X)> even when that was
not what had been stored. So in CLTL things were a mess, which is why I'm
pretty sure CLTL semantics did not require equality. But then when we moved
to ANSI, I'm not sure this part was cleaned up, even though it could have been;
there was the issue of backward compatibility, which, as I recall, we
left to vendors.
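
For comparison, in ANSI CL the list has to be converted explicitly before it can go in the function cell. A minimal sketch, using only standard operators; the names *LAMBDA-EXPRESSION*, *IDENTITY-FN*, and MY-IDENTITY are hypothetical and chosen just for this example:

```lisp
(defparameter *lambda-expression* '(lambda (x) x)) ; just a list, i.e. data

;; In ANSI CL the list itself is not a function object.
(functionp *lambda-expression*)                     ; false

;; COERCE turns a lambda expression into a real function object
;; (a closure in the null lexical environment).
(defparameter *identity-fn* (coerce *lambda-expression* 'function))

(functionp *identity-fn*)                           ; true

;; Store the function object, not the list, in the function cell.
(setf (symbol-function 'my-identity) *identity-fn*)

(funcall 'my-identity 42)                           ; => 42
```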

Underlying the entire issue is the question of whether anyone should
be storing in the function cell for the purpose of object identity.
I think good arguments can be made on both sides, so it's one of those
language design choice points one has to think hard about.
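
If object identity is what a program actually needs, one option is to keep the function objects in a data structure of its own rather than asking the function cell for them. The following is a sketch of that idea, not a recommendation from any implementation's documentation; *HANDLERS*, REGISTER-HANDLER, FIND-HANDLER, and SAME-HANDLER-P are hypothetical names.

```lisp
(defvar *handlers* (make-hash-table :test #'eq)
  "Hypothetical registry mapping a name (a symbol) to a function object.")

(defun register-handler (name fn)
  "Remember FN under NAME; the table holds the function object itself."
  (setf (gethash name *handlers*) fn))

(defun find-handler (name)
  "Return the function object registered under NAME, or NIL."
  (values (gethash name *handlers*)))

(defun same-handler-p (name-1 name-2)
  "True if both names are registered to the identical function object."
  (let ((fn-1 (find-handler name-1))
        (fn-2 (find-handler name-2)))
    (and fn-1 fn-2 (eq fn-1 fn-2) t)))
```

Here the comparison is made on objects the program stored and retrieved itself, so it does not depend on what FUNCTION or SYMBOL-FUNCTION hands back.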

I've even heard rational arguments that functions should be among the short
list of datatypes, along with numbers and characters, that are super-primitive
because of their need to be operated on by raw machine instructions, and
that should therefore always be possible to store in "dirty registers" and
hence might sometimes be split or coalesced in weird ways... for example, in
MACLISP (which pre-dated CL), small integers (the bottom range of fixnums,
but not all fixnums) were interned, and it was USUALLY possible to use EQ
on them, but even so, everyone knew not to lean very heavily on this because
sometimes in what we called number-compiled code (that is, code with heavy
declarations where you really wanted the compiler to be shifting things freely
among registers), it was critical to be able to move an integer into a
non-pointer register, where it would lose its identity... and might even
share identity with some other integer that was also using that register. Rather
than have the presence of a nearby EQ force such optimizations not to be done
(for fear the wrong result would occur), we just weakened the definition of
EQ. And I've heard suggestions this should be true of functions. (I've also
heard suggestions functions simply shouldn't be possible to compare with EQ
at all.) I don't think EQ should ever signal an error under any circumstances,
but while I tend to think functions should have identity and not be subject
to the slippery-identity thing that integers and characters are, I do think
the case is not as clear-cut for functions.
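
ANSI CL kept a similar caveat for numbers and characters: EQ is not reliable on them, and EQL is the predicate that compares them by type and value. A short sketch of the difference, with SAME-ATOM-P as a hypothetical helper name:

```lisp
;; EQL is true for numbers of the same type and value, and for
;; characters that represent the same character, however the
;; implementation happens to represent them.
(eql 3 3)                         ; true
(eql #\a #\a)                     ; true
(eql (expt 2 100) (expt 2 100))   ; true, even if each is a freshly made bignum
(eql 3 3.0)                       ; false: different types

;; EQ on the same arguments is allowed to go either way, so no result
;; is shown for it here.
;; (eq (expt 2 100) (expt 2 100))

(defun same-atom-p (a b)
  "Hypothetical helper: an identity test that is safe for numbers and characters."
  (eql a b))
```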

Whether any of this is why LW does what it does, I can't say. I'm
just speaking in the abstract. Does any of this make you more comfortable
with the fact that this might be legitimately different among implementations?