Re: AST's in assembler




KJH wrote:

Absolutely I don't mean optimizing what has been written by the programmer.

Okay, that's a start.

Assembly should be what you write is what you get.

Not necessarily. Instruction scheduling is a good example of something
that is better left to the machine in the general case. As long as a
programmer has the ability to turn such optimization off on a
case-by-case basis, I see no problem with it.

I was referring to the
idea of the assembler being able to have a "big picture" of the source.

Okay.

Not something which can be achieved simply by making two-passes over
source.

Or by any specific number of passes; how many are needed depends on
the size of the program.
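
One common reason the pass count cannot be fixed in advance (an
assumption here; the post does not name a cause) is choosing between
short and long jump encodings: widening one jump can push another out
of range. A minimal sketch, assuming a toy item list and x86-like
sizes of 2 and 5 bytes; `size_jumps` and the item format are
hypothetical names for illustration:

```python
SHORT, LONG = 2, 5  # assumed encoded sizes of a short and a long jump


def size_jumps(items):
    """Pick jump sizes by repeating passes until nothing changes.

    items is a list of ("label", name), ("jmp", target) or ("bytes", n).
    Returns (long_form, passes): a per-item flag list and the pass count.
    """
    long_form = [False] * len(items)
    passes = 0
    while True:
        passes += 1
        # Lay out the program using the sizes chosen so far.
        labels, starts, pos = {}, [], 0
        for i, (kind, arg) in enumerate(items):
            starts.append(pos)
            if kind == "label":
                labels[arg] = pos
            elif kind == "jmp":
                pos += LONG if long_form[i] else SHORT
            else:
                pos += arg
        # Widen every short jump whose target no longer fits in 8 bits.
        changed = False
        for i, (kind, arg) in enumerate(items):
            if kind == "jmp" and not long_form[i]:
                displacement = labels[arg] - (starts[i] + SHORT)
                if not -128 <= displacement <= 127:
                    long_form[i] = True
                    changed = True
        if not changed:
            return long_form, passes
```

Jumps only ever grow in this sketch, so the loop terminates, but how
many passes it takes depends on how the jumps in a given program
interact.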


Symbol table with offsets is not enough. Assembler still has to
rescan the source when making additional passes, right?

No. Many assemblers *do* use an internal representation so that they
don't have to pay the price of lexical analysis and parsing on later
passes (actually, "phases" is the correct word here). However, unlike
the AST many compilers produce, the internal representation of the
source file is often very close to the original code.
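
A minimal sketch of that idea, assuming a toy one-statement-per-line
syntax with `;` comments and `label:` prefixes (this is not any real
assembler's format; `Stmt`, `parse_once`, and `collect_labels` are
hypothetical names). The source text is lexed and parsed once into
flat records, and a later phase works only on the records:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Stmt:
    line: int                 # source line, kept for error messages
    label: Optional[str]
    mnemonic: Optional[str]
    operands: List[str]


def parse_once(source: str) -> List[Stmt]:
    """First phase: the only place that looks at the source text."""
    stmts = []
    for n, raw in enumerate(source.splitlines(), start=1):
        text = raw.split(";", 1)[0].strip()      # drop comments
        if not text:
            continue
        label = None
        if ":" in text:                          # toy rule: "name:" prefix
            label, text = (part.strip() for part in text.split(":", 1))
        parts = text.split(None, 1)
        mnemonic = parts[0] if parts else None
        rest = parts[1] if len(parts) > 1 else ""
        operands = [op.strip() for op in rest.split(",") if op.strip()]
        stmts.append(Stmt(n, label, mnemonic, operands))
    return stmts


def collect_labels(stmts: List[Stmt], size: int = 4) -> dict:
    """Later phase: walks the records, never the text.

    Assumes every instruction is `size` bytes, purely for illustration.
    """
    symbols, offset = {}, 0
    for s in stmts:
        if s.label:
            symbols[s.label] = offset
        if s.mnemonic:
            offset += size
    return symbols
```

Each record stays close to the line it came from, which matches the
point above: this is a cached parse, not a compiler-style tree.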



Even if it's in
memory mapped files (does anybody even use a strictly file-based
approach nowadays?).

Sure. If you only look at the source file during the first phase
(generating your internal representation), then a file-based approach
is fine. Because of the amount of memory modern machines have, most
assemblers just build the object code up in memory and then write the
whole thing out using standard file I/O (though, granted, that could
just as well be memory-mapped I/O).
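
As a rough illustration of building the object code in memory and
writing it out in one go, here is a sketch with a hypothetical
`ObjectBuffer` class (no real assembler's API). The back-patching
method is an added assumption, included because an in-memory buffer
makes it cheap to fix up a value after it has been emitted:

```python
import struct


class ObjectBuffer:
    """Accumulates object code in memory; written out once at the end."""

    def __init__(self):
        self.data = bytearray()

    def emit(self, *byte_values):
        """Append raw byte values (each 0..255)."""
        self.data.extend(byte_values)

    def emit_u32(self, value):
        """Append a 32-bit little-endian value."""
        self.data += struct.pack("<I", value)

    def patch_u32(self, offset, value):
        """Overwrite a previously emitted 32-bit value in place."""
        struct.pack_into("<I", self.data, offset, value)

    def write(self, path):
        """Write the whole image with ordinary file I/O."""
        with open(path, "wb") as f:
            f.write(self.data)
```

For example, `emit(0xB8)` followed by `emit_u32(0)` lays down an x86
`mov eax, imm32` with a placeholder operand, and `patch_u32(1, value)`
fills it in once the value is known.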


It can do that even with source code. Memory-mapped files, for example.


But source is one thing. When you do additional passes over source, you
still don't have anything more than maybe offsets of labels figured
out. Assembler can't really go forward or backward in source and its
semantics, it just scans line by line, and consults symbol table when
it needs to.

Actually, HLA v2.0 does use memory-mapped files and it doesn't jump
around in the source code. For example, when processing macros one
could just save a pointer into the source file and re-read that source
(hopefully in a memory-mapped file) whenever expanding the macro.
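
The macro idea can be sketched with Python's `mmap` module. The
`macro NAME` / `endm` syntax below is made up for the example (it is
not HLA syntax), and `map_source`, `record_macros`, and `expand` are
hypothetical names. The symbol table stores only byte offsets into
the mapped file, and expansion re-reads the body from the mapping:

```python
import mmap


def map_source(path):
    """Map a source file read-only; returns (file, mapping)."""
    f = open(path, "rb")
    return f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)


def record_macros(src):
    """One scan: remember where each macro body starts and ends."""
    macros = {}
    name, body_start, pos = None, 0, 0
    while pos < len(src):
        eol = src.find(b"\n", pos)
        if eol == -1:
            eol = len(src)
        line = src[pos:eol].strip()
        words = line.split()
        if len(words) > 1 and words[0] == b"macro":
            name = words[1].decode()
            body_start = min(eol + 1, len(src))
        elif line == b"endm" and name is not None:
            macros[name] = (body_start, pos)   # offsets only, no copy
            name = None
        pos = eol + 1
    return macros


def expand(src, macros, name):
    """Re-read the macro body straight from the mapped source."""
    start, end = macros[name]
    return src[start:end].decode()
```

The caller is expected to close both the mapping and the file when
done. Whether re-reading beats storing the body in the symbol table
is a design choice this sketch does not settle.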

I'm thinking about the situation where there is a
"complete" representation of source in some intermediate representation
in memory, and the assembler is able to casually go to any point in
that program and figure out more at once, than in line based approach.

Memory-mapped files certainly allow this. Though this is independent of
the use of an internal representation. Other than performance reasons,
there is no reason why one couldn't work directly with the source
code this way.




Well I'm the guy.

Okay, my bad. :-)

As far as I have seen considering TASM, some things
are cached, but basically it's a dumb rescan of every line on every
pass, although involving linked lists whose meaning is yet to be
determined.

And yet, amazingly enough, TASM is the fastest assembler out there.
Still! Kind of tells you that there is lots of room for improvement in
all the assemblers written to date, eh?



But wouldn't you agree that excluding concatenated lines, basically
line ending marks the end of one asm statement?

Depends entirely on the assembler's syntax. E.g., in HLA it's perfectly
legitimate to write

    mov
    (
        0,
        eax
    );

Like C and other HLLs, the semicolon marks the end of the statement,
not a line ending.

Technically, line endings don't even end a statement in a traditional
assembly language. They are convenient synchronization points (useful
for error recovery, for example), but you could easily develop an
"Intel Syntax" grammar that ignores new lines just like any other
whitespace.
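
A small sketch of such a grammar, assuming a tiny made-up subset of
Intel-style syntax. The `ARITY` table and the `tokenize` and `parse`
functions are hypothetical. Newlines are consumed as ordinary
whitespace, and a statement ends once its mnemonic has received the
number of operands the table says it takes:

```python
import re

# Assumed operand counts for a handful of mnemonics (illustrative only).
ARITY = {"mov": 2, "add": 2, "inc": 1, "jmp": 1, "ret": 0}

# \s* swallows spaces, tabs and newlines alike.
TOKEN = re.compile(r"\s*([A-Za-z_]\w*|\d+|[,:])")


def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise SyntaxError("unexpected character at offset %d" % pos)
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def parse(text):
    toks, stmts, i = tokenize(text), [], 0
    while i < len(toks):
        # "name :" is a label definition.
        if i + 1 < len(toks) and toks[i + 1] == ":":
            stmts.append(("label", toks[i]))
            i += 2
            continue
        mnemonic = toks[i].lower()
        if mnemonic not in ARITY:
            raise SyntaxError("unknown mnemonic %r" % toks[i])
        i += 1
        operands = []
        for k in range(ARITY[mnemonic]):
            if k > 0:                      # operands are comma-separated
                if i >= len(toks) or toks[i] != ",":
                    raise SyntaxError("expected ',' after operand")
                i += 1
            if i >= len(toks) or toks[i] in (",", ":"):
                raise SyntaxError("expected operand")
            operands.append(toks[i])
            i += 1
        stmts.append((mnemonic, operands))
    return stmts
```

With this, `parse("mov eax, 0\nret")` and `parse("mov\neax,\n0 ret")`
yield the same statements. What is lost is the easy resynchronization
point after an error, which is the trade-off noted above.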


Yes.
But many of the assemblers you're probably looking at have been hacked
up by people who've never before written a complex compiler system.
IOW, they probably don't know what you're talking about.


I understand. I barely know myself what I'm talking about :)

Well, you know what an AST is. And that puts you in a class above some
people.

Cheers,
Randy Hyde
