Re: How much tuning does regular lisp compilers do?
- From: "Thomas F. Bur***" <tbur***@xxxxxxxxx>
- Date: Sun, 31 Aug 2008 04:09:54 -0700 (PDT)
To the point you made in your earlier post, about profiling being
"interesting" with a copying GC ... of course, most of the time,
alignment issues are going to play a very minor role (it's essentially
the noise in macro-optimization); but when I have had
micro-optimization concerns, trying to get the inner loop of an algorithm to
fit into the i-cache, obviously there you do care about alignment. The
way to work around the copying GC is to have your build process result
in a dumped core, with your code in static space. Then you can look to
see if it's aligned as you need it.
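As a rough illustration of why the start address matters for a tight loop, here is a small sketch that counts how many instruction-cache lines a block of code touches. The 64-byte line size and the function name cache_lines_touched are assumptions made for the example, not anything taken from CMUCL or SBCL; the real line size depends on the CPU.

```python
def cache_lines_touched(start, size, line_size=64):
    """Return how many cache lines the byte range [start, start + size) covers.

    line_size=64 is an assumed value; check the target CPU.
    """
    if size <= 0:
        return 0
    first_line = start // line_size
    last_line = (start + size - 1) // line_size
    return last_line - first_line + 1


# A 64-byte loop that starts on a line boundary fits in one line ...
aligned = cache_lines_touched(0x1000, 64)
# ... but the same loop shifted by 48 bytes straddles two lines.
shifted = cache_lines_touched(0x1030, 64)
```

With code in static space the start address stays fixed, so a check like this against the disassembly is only done once per build.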
On Aug 31, 6:04 am, r...@xxxxxxxx (Rob Warnock) wrote:
> Bakul Shah <bakul+use...@xxxxxxxxxxxxx> wrote:
> |
> | Looks like you are saying
> | a) loops should be aligned on a 16 byte boundary
> | b) A copying GC should maintain function alignment modulo 16
> +---------------
> Exactly.
> +---------------
> | That shouldn't be too hard to fix, right?!
> +---------------
> Well... In theory, yes, but "theory vs practice", you know. ;-}
> You don't want to have to force conses to 16-byte boundaries,
> since on a 32-bit machine that would *double* the space taken
> by them [though on 64-bit it's a no-brainer], so you'd have
> to do some sort of BiBoP-like (or at least segregated) allocation
> in the GC. And then there's adding algorithms to the compiler
> to know *which* loops are worth forcing alignment on (inner
> two levels only, probably), and then adding data structures to
> the compiler to keep track of that where it's needed, and to
> glue that information together with the part of the compiler
> that stitches generated code together [since the current CMUCL
> compiler is mostly template based, viz. "VOPs"]. And then modify
> the FASL loader & format to preserve that alignment on loaded
> compiled code. Etc.
I think you're making this harder than it has to be. It would be
enough if the GC knew that code vectors need to be aligned to some
coarser granularity than other objects. If it needs to, it would waste
an extra 8 bytes when moving a code vector, but that's not a big deal
-- conses would stay unaffected, and it wouldn't be a huge, pervasive
change like partitioning the heap. If you know that code vectors are
always 16-byte aligned, it would be fairly simple to insert noops to
align given bits of the code to certain boundaries. No changes needed
to the FASL loader or anything. The big deal would be adding loop
analysis to the compiler so it could figure out *which* blocks are the
ones where alignment matters.
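The arithmetic behind the "extra 8 bytes" and the noop padding can be sketched as follows. This assumes ordinary objects are 8-byte aligned and code vectors want 16-byte alignment, as discussed above; the helper names padding_to_align and nops_before_loop are made up for the example, and a one-byte noop (as on x86) is assumed.

```python
def padding_to_align(addr, alignment=16):
    """Bytes to skip so that addr + padding is a multiple of alignment."""
    return (-addr) % alignment


def nops_before_loop(offset_in_code, alignment=16):
    """One-byte noops to insert so a loop head at offset_in_code lands on an
    alignment boundary, assuming the code vector itself starts aligned."""
    return (-offset_in_code) % alignment


# If the GC's free pointer is always 8-byte aligned, moving a code vector
# to a 16-byte boundary wastes either 0 or 8 bytes, never more.
worst_case = max(padding_to_align(a) for a in range(0, 256, 8))

# A loop head at byte offset 37 of an aligned code vector needs 11 noops
# to start at offset 48.
pad = nops_before_loop(37)
```

The second helper only gives a useful answer if the first guarantee holds, which is why the GC has to preserve code vector alignment when it moves them.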
> Certainly "not too hard" for a funded project by a small team
> of experienced compiler writers, but I suspect that resource
> isn't currently available for CMUCL.
There I agree with you. Adding loop analysis to sbcl has been
threatened in the past, and if someone gets around to it, this is the
sort of thing that could quite easily come out of such an effort.