Re: How much tuning does regular lisp compilers do?
- From: Duane Rettig <duane@xxxxxxxxx>
- Date: Thu, 04 Sep 2008 09:02:33 -0700
rpw3@xxxxxxxx (Rob Warnock) writes:
verec <verec@xxxxxxx> wrote:
+---------------
| rpw3@xxxxxxxx (Rob Warnock) wrote:
| [a very detailed explanation of his Lisp on Athlon cache alignment
| experiments]
|
| That's certainly more than what I expected, but then begs the
| question of how realistic such improved "cached aligned" loops
| might be in the real world:
|
| Given the byzantine number of bytes per opcode requirement of
| the x86 ISA, it is very likely that most loops will need more
| than 16 bytes of code between the branch-back and the top of
| the loop, thus incurring a systematic "16 bytes fetch cycle"
| miss each and every time through the loop.
+---------------
In my conversations with people who *are* experienced compiler
writers [I'm not, just an amateur there], I've been told that
for algorithmic kernels keeping loops 16-byte-aligned is an
easy way to get a small but measurable improvement. I would
expect that on modern x86 (and x86_64) machines the penalty
for misalignment would be only one or two cycles. Whether that
is important enough to you (generic "you") to affect your buying
decision about which compiler to use is something you'd have
to answer for yourself. But compiler writers *do* worry about
such things.
It's not only about the time to load the cache line, but how many you'll
be loading; if a loop is small enough to fit within one cache line but
starts in the middle of one, then two cache lines are consumed, and that
leaves one less physical line in the cache for swapping out when the LRU
takes over (presumably these two lines are recently used, and so will not
be flushed themselves). So a misaligned loop makes the whole machine
slower, in a sense, by giving it a smaller available cache.
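
To put rough numbers on the point above, here is a minimal sketch that counts how many aligned blocks a loop body overlaps, given its start address and length in bytes. The function names, the example addresses, and the 64-byte line size are assumptions chosen for illustration (the 16-byte figure is the fetch size mentioned earlier in the thread); real sizes depend on the processor.

```python
def blocks_touched(start: int, length: int, block_size: int = 64) -> int:
    """Count the aligned block_size-byte blocks overlapped by the
    byte range [start, start + length)."""
    if length <= 0:
        return 0
    first = start // block_size                # block holding the first byte
    last = (start + length - 1) // block_size  # block holding the last byte
    return last - first + 1


def padding_to_align(start: int, alignment: int = 16) -> int:
    """Bytes of padding needed to move start up to the next multiple
    of alignment (0 if it is already aligned)."""
    return -start % alignment


# A hypothetical 40-byte loop, assuming 64-byte cache lines:
aligned = blocks_touched(0x1000, 40)           # starts on a line boundary
misaligned = blocks_touched(0x1020, 40)        # starts mid-line

# The same loop counted in 16-byte fetch blocks:
fetch_aligned = blocks_touched(0x1000, 40, 16)
fetch_misaligned = blocks_touched(0x100C, 40, 16)
pad = padding_to_align(0x100C, 16)
```

Under these assumptions the aligned loop occupies one cache line and the misaligned one occupies two, which is the extra line described above. In 16-byte terms the loop needs three fetch blocks when aligned and four when it starts at offset 12, and four bytes of padding would restore the alignment.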
But to answer the unstated related question: yes, there have
been cases historically where the compiler had to pay attention
to page crossings as well as cache line crossings. [E.g., the
infamous early MIPS R4000 bug that showed up if a jump was the
last instruction on a page, "fixed" with a change to the linker!!
The SGI compiler included a "-no_jump_at_eop" option in "ld"
for a while.]
We refused to buckle to this kind of hardware bug; it took us a year
to get SGI to even admit that they had a bug (they had claimed to
have fixed the bug with software, but that fix counted on the code
being at a fixed location on the page). And because we had some large
customers behind us, we finally got the issue resolved. We noted that
the number of people with that particular lot of R4000s who were also
using lisp was in the single digits, because most of them had already
moved on, and so SGI agreed to replace the chips of anyone who
requested such a replacement in a certain way.
--
Duane Rettig duane@xxxxxxxxx Franz Inc. http://www.franz.com/
555 12th St., Suite 1450 http://www.555citycenter.com/
Oakland, Ca. 94607 Phone: (510) 452-2000; Fax: (510) 452-0182