Re: Switch from SBCL to Erlang backend due to scalability issues(GC).



Duane Rettig <duane@xxxxxxxxx> writes:

"Wolfram Fenske" <int2k@xxxxxxx> writes:

Does anyone know why CMU-CL and SBCL do that? This snippet only says
that they think a conservative GC is probably not slower than a
precise GC on x86. It doesn't say they think it's faster. That
sounds to me like the reason was not efficiency. What was it then?

The tradeoff is runtime efficiency. A conservative gc is needed when
not all lisp locals (i.e. tagged values dedstined for the stack) are
initialized at the start of the function.

Ah. I've seen this in OCaml, which uses precise GC AFAICT. In the
parts of it that are witten in C and in C extensions (FFI), you have
to use macros to declare local variables. Besides registering the
locals with the GC they also initialize them with NULL.

Given register allocation techniques that bound live ranges of
variables only when they are needed, it is quite common to see a
stack location never touched by the function as compiled.

I don't quite understand this statement. Isn't that a contradiction
to the SBCL developers saying that they don't use a precise GC on x86
because it is register-poor? If there a few registers, many locals
will be read from and written to memory instead of a register. So it
should be more likely that their stack locations have to be touched
than on architectires with more registers. Or are you saying you
usually need more stack locations on x86 because fewer variables
reside in registers, and initializing all those stack locations would
be too expensive?

There is overhead in initializing such variables; small, to be sure,
but that overhead is constant

I was under the impression that memory access is more efficient on a
CISC architecture because it is needed more frequently than on RISC's
(I may be wrong, though. I'm not very knowledgeable in this area.).
This should reduce the overhead of initializing local variables with
NULL on x86.

and is a function of the complexity of the whole function, rather
than of the portion of the algorithm that is active. So in
optimizing a particular larger algorithm, with conservative gc
(i.e. lazy initialization of variables) you could consider a
function's efficiency based loosely on the complexity of its data,
whereas if all variables must be pre-initialized, you would have to
include the function's number of internal variables in order to
decide how to maximize your run-time. That would tend to lead to
the breaking up of functions into smaller chunks, which shouldn't be
necessary.

I see.

[...] (rest of the posting snipped)

--
Wolfram Fenske

A: Yes.
Q: Are you sure?
A: Because it reverses the logical flow of conversation.
Q: Why is top posting frowned upon?

.



Relevant Pages

  • Re: Switch from SBCL to Erlang backend due to scalability issues(GC).
    ... should be more likely that their stack locations have to be touched ... than on architectires with more registers. ... and initializing all those stack locations would ... CISC architecture because it is needed more frequently than on RISC's ...
    (comp.lang.lisp)
  • Re: non load/store architecture?
    ... Programmers can write great code on a RISC ... Especially in a PC marketplace dominated by the x86. ... Lots of registers and an orthogonal instruction set are important - both have these. ... The lack of registers means much more memory IO, which causes stalls and requires complex scheduling. ...
    (comp.arch.embedded)
  • Re: Intel publishes Larrabee paper
    ... Using x86 as the base it is understandable that the third vector operand ... 32 registers times 4 sets, plus some rename registers and you are ... visible registers your actual register set becomes large, ... The die size tradeoffs of 4 way multi threading is mostly about reducing ...
    (comp.arch)
  • Re: Relocatable/PIC asm (was intra-segment CALL and JMP)
    ... I'm only writing in x86 assembly right now, ... data back and forth between registers and memory. ... instructions, especially if the instruction is "pairable", i.e., ... Keeping the top two stack items in registers ...
    (alt.lang.asm)
  • Re: More blather about atom
    ... execute on whatever OOO core keeps your x86 workload happy. ... Itanium is so big and complicated, and has so many registers, that it ...
    (comp.arch)