Re: Compilation Was: Is DEFCONSTANT broken?



On Jun 25, 1:27 pm, p...@xxxxxxxxxxxxxxxxx (Pascal J. Bourguignon)
wrote:
Duane Rettig <du...@xxxxxxxxx> writes:
On Jun 25, 1:28 am, p...@xxxxxxxxxxxxxxxxx (Pascal J. Bourguignon)
wrote:
Duane Rettig <du...@xxxxxxxxx> writes:
The problem here is that the whole reason for
compilation is to reduce code complexity, and having these hooks to
break these simplifications goes against the goal of compilation.  

This surprize me.

I thought that the purpose of compilation was foremost to optimize
speed.

No, although that's usually a by-product, and thus the user tends to
make that a goal.

We wouldn't want to merely translate to a different target machine, if
it didn't allow to run faster.  There'd be no money in compilers.

There's money in compilers? :)

So if it's not porting a program from one VM to another, and it's not
for speed, why do we want to compile?

Of course _you_ want to compile for speed. But you're looking at my
original paragraph from the wrong point of view; it was never meant to
explore the reasons why _we_ use compilers, but instead from the point
of view of the compiler itself. I suppose I should have used the word
"goal" or "purpose" instead of "reason", as if to say what the
compiler's goal is. Our previous conversation about defconstant
presupposes the need for the compiler, because without the compiler
most of the issue goes away. So I was using "reason" here in more of
a limited sense which might be more appropriately described as "goal"
or "purpose", not in the larger sense of "why is it useful"...

I'm not sure that it reduces code complexity.  For a start, assembler
is harder to read than high level language.

Harder for whom or what?   It's certainly easier to read for the
executing machine.  And we're not talking about readability, but
complexity.  And, by the way, assembler code is not compiled code; it
is also source code which must be assembled in order to execute
directly on a machine.  You might consider assembler to be equivalent
to machine code, because you usually have excellent disassembling
tools, which reintroduce the complexities of mnemonic names,
connections between jump instructions and their target labels
(remember, though, that labels don't actually exist in machine code -
they are just instructions which happen to be the target of the jump).

Yes, with plain assemblers and simple enough processors, there's a
quasi bijection between assembler text and binary.  But I'm aware of
the difference, only I don't think it's signficative.  My bet is that
assembler and binary have a complexity ratio close to 1 (see below).

Yes, close to 1, but I was only making a point. We tend to think in
terms of assembler, but that is because we tend to have many tools
that can perform the abstraction needed to look at assembler source
rather than binary code. Have you ever been on a machine which does
not have gdb, with all of its disassembling capabilities, and had to
perform the calculations of what instruction you're observing, or if
it happens to be a branch instruction, to what address the branch
location points? A few hours working like this might not change your
mind as to the one-to-one nature of assembler to machine code, but it
would certainly give you a different perspective. Long before I
became a software engineer (and even before I had become a hardware
engineer before that) I developed such an appreciation for assemblers
and disassemblers by trying to hand-disassemble hex dumps of old 6802
programs...

  Then the compiler may
open code, inline, unroll loops, and implement any other kind of trick
which dilutes abstractions a lot, which I understand as rendering the
code more complex, not less.

Actually, quite the opposite.  A loop is more complex than straight-
line code; it has a branch instruction within it, which is well-
understood to create pipeline stalls, unless the hardware has the
added complexity itself to duplicate effort in order to perform branch
prediction.   No, the simplest code is the fully-unrolled loop which
executes no branches, but which simply performs the required
instructions.  Is it necessarily optimal? No, because code size does
factor into the mix, and so the fact that loops are only unrolled
partially is due to tradeoffs between code complexity and instruction
cache size.

We fight complexity with abstractions.  Which is mostly inventing new
levels of source code generating the existing lower level of lisp, VOP
or binary.  If adding an abstraction didn't reduce the complexity we
wouldn't do it, we'd keep programming in assembler.

Yes, you're thinking about this backwards. Abstractions are in the
upward (i.e. on the scale of low-level to high-level) direction, which
reduces complexity for the user, but which also _increases_ complexity
for the tools that are performing the abstraction for you.

Consider a simple example at the hardware level: If you were wanting
to write a cos() function in assembler from scratch, on a RISC machine
it would be a very complex set of assembler instructions, which would
likely be fairly large in code space (at least several thousand
instructions (in fact, on an Alpha it seems to be about 3600 bytes).
In a CISC machine (where, interestingly, the first C stands for
Complex), where the machine has increased its complexity so that the
user doesn't have to deal with it, the disassembled cos function is
more like 38 bytes, most of which is taken up in handling edge cases
that the fcos instruction (on x87 hardware) doesn't handle. Again,
from the user's point of view, less complex. From the compiler's
point of view, more complex.

So, in going the other direction, from source code to assembler/
machine code (or, in the case of cos, from the single complex
instruction to the thousands of instructions that would have been
required to implement it if it weren't there), the direction from high-
level to low level is one of increasing simplicity. More
instructions, yes. But also more simplicity. The 3600 bytes worth of
instructions to implement cos on the alpha are all much simpler than
the single fcos instruction on the x87.

[snip]

I don't buy a correlation between compressibility and simplicity any
more than I buy a forced correlation between LOC generated and
paycheck. Numbers can be generated and show whatever you please, but
there are too many factors involved to draw such a simplistic
conclusion.

Duane
.



Relevant Pages

  • Re: Compilation Was: Is DEFCONSTANT broken?
    ... compilation is to reduce code complexity, ... break these simplifications goes against the goal of compilation. ... And, by the way, assembler code is not compiled code; ... they are just instructions which happen to be the target of the jump). ...
    (comp.lang.lisp)
  • Re: Why We Use C Than C++...
    ... >> object orient design is something that comes naturally to many programmers ... >Complexity with more complexity until no one on earth, ... >Man years of toiling to build incredible fragile compilers ... I would never write a kernel with dependencies on a garbage collector. ...
    (comp.lang.c)
  • Re: C/C++ guidelines
    ... The short answer is a lot better than C99 standard compliance! ... given the complexity of the language. ... which pushes the limits of most compilers. ...
    (comp.lang.c)
  • Re: A quick c programming survey question
    ... It comes as a suprise that people would 'recommend' it. ... > compilers and targets it is the method that generates most efficient code. ... This complexity can easily be ... complex tasks once and not move then to the everyday function. ...
    (comp.arch.embedded)
  • Re: Philosophical question: Is there a limit to the complexity of a program humans can write?
    ... There are limits where the complexity of the program begins to ... Here, try a code generator: ... that is really tough for some compilers to handle. ... Prev by Date: ...
    (comp.lang.fortran)