Re: from elsewhere, an assembler




"Wolfgang Kern" <nowhere@xxxxxxxxxxx> wrote in message
news:ev5a4h$dt4$2@xxxxxxxxxxxxxxxxxxxxxxxx

Hi "cr88192",

... the group is human moderated (vs. machine moderated),

Be aware, Chuck posts here in ALA in the living form! :)


ok.


[..]
First, I'd keep the semicolon as the "old standard" comment
delimiter and use the separator "|" between instructions.


yes, I just don't like '|' used this way, and may want to use it for something
else. also, I am more used to it serving as an operator than a separator,
with typical separators being ';', ',', whitespace (often a "very soft"
separator), and newlines (often a "less soft" or "hard" separator). to my
eyes, since ',' was already used, and ';' was not (at least as a separator),
it seemed to make sense.


but, yes, semicolon serves as both comment and separator...

as noted elsewhere, I already have a good mass of code (maybe 5-10 kloc or
so) that depends on this particular feature (2 different JIT backends), and
it would be too much of a hassle to go and modify it.

note: this kind of character overloading is very common in HLLs, which is
what I am most used to.


[..]
You should avoid automatic code-creation and offer
a few macros instead of inventing new instructions.

But it would be right for the few instructions which
"really need" a sizecast ie:

INCb/w/q/o[mem] instead of: INC very long word pointer [mem]


macros:
would be a hassle to implement (sensible, maybe, if there will be a lot of
human-written code, but very optional for machine-written code, where adding
a few synthetic utility ops may make sense).

if you are asking about inc_r and dec_r, these were because originally, the
opcodes had conflicted with the REX prefix, and at the time I was uncertain
as to how I would distinguish between x86-32 and 64.

later on, I added a feature so that writing:
inc reg

is recognized by the assembler and, if not in x86-64 mode, silently
converted to the '_r' form.
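
as a rough illustration of that mode-dependent choice (a Python sketch, not the assembler's actual code): it assumes the '_r' form means the one-byte 0x40+r encoding, and the names `REG32` and `encode_inc` are made up for the example.

```python
# Register numbers for the eight classic 32-bit general-purpose registers.
REG32 = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
         "esp": 4, "ebp": 5, "esi": 6, "edi": 7}


def encode_inc(reg: str, long_mode: bool) -> bytes:
    """Encode `inc reg32`, picking a form that is valid in the given mode."""
    n = REG32[reg]
    if long_mode:
        # 0x40..0x4F are REX prefixes in 64-bit mode, so fall back to
        # FF /0 with a register-direct ModR/M byte (mod=11, reg=0, rm=n).
        return bytes([0xFF, 0xC0 | n])
    # Outside 64-bit mode the one-byte short form 0x40+r is available.
    return bytes([0x40 + n])
```

the same idea would apply to dec (0x48+r vs. FF /1), which is left out to keep the sketch short.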

some other cases are similar.

inc word [esi]
inc dword [esi]

is how it is done at present in my case (ptr is optional/ignored).
note that often duplicating opcodes with different names leads to inflation
in the listing files (they are regarded as completely different opcodes, and
are duplicated accordingly).

some cases have been handled this way though, namely where it was ambiguous
(ie: my current assembler can't figure it out).

thus:
movzx and movsx

now have alternative forms:
movzxw and movsxw

could possibly also add:
movzxb and movsxb
as alternatives to the originals (for clarity).


this was not done out of any aesthetic sense, but rather, because I needed
to actually use them and the assembler had a limitation...
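
to show where the ambiguity comes from: with a memory source, plain movzx/movsx does not say whether the source is a byte or a word, and the two cases use different opcodes. a minimal Python sketch (illustrative only; `extend_opcode` and the tables are hypothetical names, and the suffix handling is an assumption about how the 'b'/'w' forms would map):

```python
# Second opcode byte (after the 0x0F escape), keyed by (mnemonic, source bits).
EXTEND_OPCODES = {
    ("movzx", 8): 0xB6, ("movzx", 16): 0xB7,
    ("movsx", 8): 0xBE, ("movsx", 16): 0xBF,
}
SUFFIX_BITS = {"b": 8, "w": 16}


def extend_opcode(mnemonic: str, src_bits=None) -> bytes:
    """Return the two opcode bytes for a movzx/movsx variant."""
    # Suffixed names (movzxb, movsxw, ...) carry the source size themselves.
    if mnemonic[-1] in SUFFIX_BITS and mnemonic[:-1] in ("movzx", "movsx"):
        src_bits = SUFFIX_BITS[mnemonic[-1]]
        mnemonic = mnemonic[:-1]
    if src_bits is None:
        # This is the ambiguous case: e.g. `movzx eax, [esi]` with no size.
        raise ValueError("source size is ambiguous for " + mnemonic)
    return bytes([0x0F, EXTEND_OPCODES[(mnemonic, src_bits)]])
```

with a register source the size can be taken from the register name, so only the memory-operand case really needs the suffixed form or a size keyword.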


[..]
Mmh? Compile in memory?
Where else? :)

You mean immediate compilation with prototype opcodes,
where the programmer can immediately see code size and format?
This is an interesting attempt as it would help for better
performing coding styles in general.


actually, I mean that I directly compile/assemble the code and run it
where it is (vs. storing it in object files and passing it off to the
linker). this is why it needs to auto-link against the host app, so that
eventually I may be able to run dynamically compiled code just like statically
compiled code (apart from the fact that, sadly, anything pruned out by the
traditional linker is not directly usable).

as such, I am considering specialized object file, and possibly library
loading, where it may be possible to pull new code/data from libraries as
needed (or to simply just link the whole big mass into memory). at least if
I am using this with statically compiled versions of the same libs, the
static versions should get precedence (so I am not ending up with mixed
duplicated and non-duplicated state).


if I were doing the traditional thing, likely I would just use nasm (or
gas).



and a listing fragment (basic syntax):
add
04,ib al,i8
X80/0,ib rm8,i8
WX83/0,ib rm16,i8
TX83/0,ib rm32,i8
X83/0,ib rm64,i8
X02/r r8,rm8

where W/T/X/... tell where prefixes go (Word, DWord, REX).

I've seen it on CLAX, now I think I know its purpose...


the listing is used in my assembler, to autogenerate the tables needed for
doing assembly.

the actual assembler itself knows hardly anything about the instructions,
only a few possible configurations, the registers, and how to encode certain
structures (such as the ModR/M and SIB bytes, ...), and past this point is
driven largely by tables.
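
a sketch of how one entry of such a listing might be turned into a table row (Python; the grammar is guessed from the fragment above only, and `parse_entry` and the field names are hypothetical, so the real listing format may well have more cases):

```python
import re

# Guessed shape of an encoding field: prefix letters (W/T/X), hex opcode
# bytes, an optional /digit or /r for ModR/M, an optional ,immediate tag.
ENTRY = re.compile(r"^([WTX]*)((?:[0-9A-Fa-f]{2})+)(?:/([0-7r]))?(?:,(\w+))?$")


def parse_entry(line: str) -> dict:
    """Split a line such as 'WX83/0,ib rm16,i8' into its parts."""
    encoding, operands = line.split()
    m = ENTRY.match(encoding)
    if m is None:
        raise ValueError(f"unrecognized encoding: {encoding!r}")
    prefixes, opcode_hex, modrm, imm = m.groups()
    return {
        "prefixes": list(prefixes),           # where W/T/X prefixes may go
        "opcode": bytes.fromhex(opcode_hex),  # fixed opcode bytes
        "modrm": modrm,                       # '0'..'7', 'r', or None
        "imm": imm,                           # e.g. 'ib', or None
        "operands": operands.split(","),      # operand pattern to match
    }
```

for example, `parse_entry("X02/r r8,rm8")` gives opcode 02 with a '/r' ModR/M field and the operand pattern r8,rm8; matching an actual instruction against the operand patterns is the part the table lookup would then do.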

potentially, other things, like the registers and encodings, could also be
moved to listing files, but at present this is not needed (could make sense,
ie, if the plan were to support further-reaching and non-x86 targets, but
more likely if it ever came to that it would make more sense just to write a
new assembler).

however, with some more recent changes the address-size prefix is now inserted
automatically in some cases, so it works differently from the way it is done
in the listings (potentially, some esoteric situations could result in
duplicate address-size prefixes, which would be bad).


then again, this would be cases like:
a16 jmp_w foo

likely rule:
avoid manual overrides if at all sensible.

if one types:
mov eax, [di] ;in 32-bit mode
or:
mov ax, [fs:esi] ;in 16-bit mode

at present the assembler should do something sensible (and in these
particular cases, the a16 or a32 prefix is simply redundant).
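
the rule behind those two cases can be sketched like this (Python, illustrative only; `needs_addr_prefix` and `address_prefixes` are hypothetical names, and checking for an already-present prefix is just one possible way to avoid emitting 0x67 twice, not necessarily what the assembler does):

```python
ADDR_SIZE_PREFIX = 0x67  # x86 address-size override prefix


def needs_addr_prefix(mode_bits: int, addr_bits: int) -> bool:
    """True if addressing of addr_bits width needs 0x67 in the given mode."""
    if mode_bits == 64:
        if addr_bits == 16:
            raise ValueError("16-bit addressing is not encodable in 64-bit mode")
        return addr_bits == 32
    if addr_bits == 64:
        raise ValueError("64-bit addressing requires 64-bit mode")
    # In 16- and 32-bit modes the prefix toggles to the other address size.
    return addr_bits != mode_bits


def address_prefixes(mode_bits: int, addr_bits: int, explicit=()) -> bytes:
    """Combine explicit prefixes with an automatic 0x67, at most once."""
    prefixes = list(explicit)
    if needs_addr_prefix(mode_bits, addr_bits) and ADDR_SIZE_PREFIX not in prefixes:
        prefixes.append(ADDR_SIZE_PREFIX)
    return bytes(prefixes)
```

so [di] in 32-bit mode and [fs:esi] in 16-bit mode both get exactly one 0x67, whether or not an explicit a16/a32 was also written (the fs segment override is a separate prefix and is not handled here).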

may clean this up eventually...



__
wolfgang


