Re: 8051 assembler in Common Lisp



Greg Menke <gregm-xyzpdq@xxxxxxxxxxxx> writes:

> Peter Seibel <peter@xxxxxxxxxxxxxxx> writes:
>
>
>> Greg Menke <gregm-xyzpdq@xxxxxxxxxxxx> writes:
>>
>> > But I intern top-level symbols at read time but do the labels in the
>> > final link pass once all addresses are known. I add an item to each
>> > top-level symbol's property list. The symbol-value is set to the linked
>> > address for all symbols so the address arithmetic will work.
>>
>> Can you show us the code that "intern[s] top-level symbols at read
>> time"? Unless you wrote your own reader, the reader is already
>> interning the symbols which makes me wonder what you're really
>> doing. Also, a symbol doesn't need to be interned to have its
>> symbol-value set. So, basically, what you're saying doesn't make
>> any sense. That may be because you're talking about something
>> sensical in a slightly incorrect way or because what you're doing
>> actually doesn't make complete sense.
>
> Sure. Once the assembler is working "industrially" enough to compile
> some projects so I can test it on real hardware, I'll post the full
> code. Right now it's changing quickly and I still need to generate
> .lis files and symbol crossreferences.

Okay. Meanwhile, if you're interested I have some comments on what you
show here. See below.

> Some terminology;
>
> sym is always bound to a top-level symbol
>
> syminst is the compiler state data struct recorded in each top-level
> symbol's property list
>
> (sym-labels) accessor that gets/sets the list of labels contained in
> a top-level symbol
>
> (sym-data) accessor that gets/sets the non-evaled list of sexps
> forming the code in a top-level symbol
>
> (sym-compiled) accessor that gets/sets the list of eval'ed sexps of
> a top-level symbol.
>
> (sym-linked) accessor for the output of the compiler; a sequence of
> bytes comprising the compiled and linked code
>
>
>
> These macros handle creating and interning the top-level symbols at
> read time;

This sentence doesn't really make sense. Macros operate *after* read
time, at macro-expansion time, so they can't be doing anything at
read time.

> (defmacro with-sym-setup ((name opts symtype rest) &body body)
> `(let ((sym (intern (string ',name)))

The previous line likely has no effect at all. Note that:

(eql (intern (string symbol)) symbol) ==> T

as long as (eql (symbol-package symbol) *package*). In other words, if
*package* hasn't changed value between the time the symbol that is the
value of NAME was read and the time the macro-expansion of
WITH-SYM-SETUP happens, (intern (string ',name)) is just going to
return the value of NAME. You might as well have said:

(let ((sym name)) ...)

or for that matter:

(defmacro with-sym-setup ((sym opts symtype rest) &body body) ...)
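To make the identity above concrete, here is a small check that can be
run at the REPL. The helper name REINTERNED-P is made up for this
illustration and is not part of the assembler being discussed.

```lisp
;; Hypothetical helper, for illustration only.
(defun reinterned-p (symbol &optional (package *package*))
  "True if interning SYMBOL's name in PACKAGE yields SYMBOL itself."
  (eq (intern (string symbol) package) symbol))

;; The reader interned LABEL1 when this form was read, so interning its
;; name again in the same package hands back the very same symbol.
(reinterned-p 'label1 (symbol-package 'label1))

;; A fresh uninterned symbol with the same name is a different object,
;; so INTERN returns the interned LABEL1 instead of it.
(reinterned-p (make-symbol "LABEL1") (symbol-package 'label1))
```

The first call is true and the second is false, which is the whole
effect INTERN can have in WITH-SYM-SETUP when *package* is unchanged.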

> (syminst (make-instance ,symtype )))
>
> (proclaim '(special ,name))
>
> ;; symbol-value is the symbol's linked address
> (setf (symbol-value sym) nil)

The previous two lines seem hinky to me. For one thing the
proclamation has a global effect so is not very friendly. And it's
hard to imagine that it's actually necessary. (Though it may be
necessary given the way the rest of the current code works.)

> ;; add syminst to the symbol's prop list
> (setf (get sym 'syminst) syminst)

This seems fine. It's appropriate for a compiler to hang information
about names off the name's plist.

> ;; and init the syminst fields
> (setf (sym-name syminst) (symbol-name sym))
> (setf (sym-address syminst) nil)
>
> (setf (sym-org syminst) (getf ',opts :org nil))
> (setf (sym-align syminst) (getf ',opts :align nil))
>
> (setf (sym-data syminst) ',rest)

This is all fine though it might be more obvious what's going on if
you'd define :initargs for these slots in the classes you may be
instantiating. Then you could write this:

`(let ((syminst (make-instance ,symtype
                               :name (symbol-name sym)
                               :address nil
                               :org (getf ',opts :org nil)
                               :align (getf ',opts :align nil)
                               :data ',rest)))
   ..)

and get rid of the SETFs.
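For the MAKE-INSTANCE call above to work, the class has to declare the
matching :initargs. A minimal sketch, assuming a class shaped like the
one the quoted code implies; the class name DEMO-SYM and the variable
*DEMO* are invented here, and only the accessor names come from the
quoted code:

```lisp
;; Hypothetical class; only the accessor names mirror the quoted code.
(defclass demo-sym ()
  ((name    :initarg :name    :accessor sym-name)
   (address :initarg :address :accessor sym-address :initform nil)
   (org     :initarg :org     :accessor sym-org     :initform nil)
   (align   :initarg :align   :accessor sym-align   :initform nil)
   (data    :initarg :data    :accessor sym-data    :initform nil)))

;; All slots are filled in one MAKE-INSTANCE call; slots whose initarg
;; is omitted fall back to their :initform.
(defparameter *demo*
  (make-instance 'demo-sym
                 :name "FOO"
                 :org (getf '(:org 256) :org nil)
                 :data '((nop))))
```

With the :initforms in place, the explicit :address nil is not even
needed.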

> (setf asm51::*cursymbol* syminst)

I'm wondering why you're setting this rather than binding it.
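As a sketch of the difference, assuming *cursymbol* is an ordinary
special variable (the DEFVAR and both function names below are
stand-ins written for this example):

```lisp
(defvar *cursymbol* nil
  "Stand-in for asm51::*cursymbol*; defined here only for the example.")

(defun current-symbol ()
  ;; Sees whatever binding of *cursymbol* is in effect at call time.
  *cursymbol*)

(defun call-with-cursymbol (syminst thunk)
  ;; LET rebinds the special variable for the dynamic extent of the
  ;; call. The previous value is restored on exit, even a non-local one.
  (let ((*cursymbol* syminst))
    (funcall thunk)))
```

Inside CALL-WITH-CURSYMBOL the thunk sees the new value; once it
returns, *cursymbol* is back to what it was, with no cleanup code. A
SETF would leave the last symbol's state lying around globally.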

> ,@body ))
>
> (defmacro deftext (name opts &rest rest)
>   `(asm51::with-sym-setup (,name ,opts '_textsym ,rest)
>      (push sym asm51::*textsyms*)))

I wonder why you're package-qualifying the symbols with-sym-setup and
*textsyms*. Aren't these macros defined in a file with an (in-package
:asm51) form? (I point this out because it may be a symptom of
another flavor of confusion about the relation between read time and
macroexpand time and how they both relate to packages.)

> (defmacro defdata (name opts &rest rest)
>   `(asm51::with-sym-setup (,name ,opts '_datasym ,rest)
>      (push sym asm51::*datasyms*)))
>
> (defmacro defbss (name opts &rest rest)
>   `(asm51::with-sym-setup (,name ,opts '_bsssym ,rest)
>      (push sym asm51::*bsssyms*)))
>
>
> via this loop in pass #1 (read pass)
>
> (loop for e = (read str nil nil)
>       while e
>       do
>       (eval e))

I'm confused about the relationship between this loop and the macros
shown above. Are you using this loop to read a series of forms from
a file where the top-level forms are likely to be DEFTEXT,
DEFDATA, and DEFBSS forms? If so, why don't you just use LOAD to load
the file? Then you could also use COMPILE-FILE to compile the files
down to something that already has the macros expanded, etc. and will
LOAD faster. (BTW, this use of EVAL, if I've understood things
correctly, is fine since it's the level of evaluation that LOAD would
normally do for you.)
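A small sketch of LOAD doing the work of the READ/EVAL loop. The macro
DEFDEMO and the variable *SEEN* are invented stand-ins for DEFTEXT and
*textsyms*, and a string stream stands in for the source file so the
example is self-contained:

```lisp
(defvar *seen* '()
  "Hypothetical accumulator standing in for *textsyms* and friends.")

(defmacro defdemo (name)
  ;; Stand-in for DEFTEXT/DEFDATA/DEFBSS: just record the name.
  `(push ',name *seen*))

;; LOAD reads each top-level form from the stream and evaluates it,
;; which is exactly what the hand-written loop does.
(with-input-from-string (source "(defdemo foo) (defdemo bar)")
  (load source))
```

Afterwards *SEEN* holds BAR and FOO. With a real file, the same thing
is (load "prog.lisp"), or (load (compile-file "prog.lisp")) to get the
macros expanded ahead of time; the file name is only a placeholder.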

> In pass #2 (top-level symbol sexp-by-sexp code eval), I identify and
> record each label symbol but do not intern it;

So the interning is still a red herring. What do you think interning
does?

> (loop for se in (sym-data syminst)
>       for e = nil
>       with rv = nil
>       do
>       ;;
>       ;; evaluate the sexp, accumulate non-nil results in rv
>       ;;
>       (cond ((symbolp se)
>              ;; se is symbol, make a label out of it
>              (setf e (make-label se))
>              ;; save it in the labels list
>              (push e (sym-labels syminst)) )
>
>             (t
>              ;; not a symbol, eval it
>              (setf e (eval se)) ) )

This seems, if you'll pardon me being blunt, like pretty much a
canonical example of misuse of EVAL in a macro. The value of se was
some code that was in the body of, say, a DEFTEXT form. If you want
that code to be evaluated, you should arrange for it to be put into
the expansion of DEFTEXT so it will be evaluated. Of course in your
case that's not a simple change because you've, as far as I can tell,
turned the whole macro-expansion machinery inside out.
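One way to read "put it into the expansion" is sketched below. This is
not a drop-in replacement for the quoted DEFTEXT: the macro name
DEFSNIPPET, the SNIPPET property, and the :LABEL/:CODE tags are all
invented for the illustration.

```lisp
;; Hypothetical macro: bare symbols in the body become labels, and every
;; other form is spliced into the expansion so that the normal evaluator
;; runs it. No call to EVAL is needed.
(defmacro defsnippet (name &body body)
  `(setf (get ',name 'snippet)
         (list ,@(mapcar (lambda (item)
                           (if (symbolp item)
                               `(list :label ',item)
                               `(list :code ,item)))
                         body))))

(defsnippet demo-snippet
  start
  (+ 1 2)
  loop-top
  (* 2 3))
```

After the DEFSNIPPET form is evaluated, (get 'demo-snippet 'snippet)
holds the labels and the values of the two arithmetic forms, in order.
Because the forms sit in the expansion, COMPILE-FILE compiles them
along with everything else.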

> Now later on in pass #5 (link pass), I intern & set the value of the
> label symbols within each top-level symbol.

Again, what do you think you're accomplishing by interning these
symbols? They're already interned. They've been interned ever since
they were read. And even if they weren't interned, you could still do
everything with them that you're doing anyway.
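A quick demonstration of that last point; the function name is made up
for the example:

```lisp
(defun uninterned-symbol-demo ()
  (let ((label (make-symbol "LABEL1")))   ; fresh symbol, in no package
    (setf (symbol-value label) #x40)      ; it can still hold a value
    (setf (get label 'kind) :label)       ; and carry a property list
    (list (symbol-package label)
          (symbol-value label)
          (get label 'kind))))
```

Calling (uninterned-symbol-demo) returns (NIL 64 :LABEL): the symbol
has no home package, yet SYMBOL-VALUE and GET work on it just as they
do on an interned symbol.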

> I do it here so the same labels can be used in different top-level
> symbols, but are only defined within the top-level symbol- sort of a
> "local" label.
>
>
> ;; define all the labels in this symbol so the code-gen eval
> ;; can be done
> (loop for e in (sym-labels syminst)
>       for name = (lab-name e)
>       do
>       (eval `(progn
>                (intern (string ',name))
>                (proclaim '(special ,name))
>                (setf ,name (lab-address ,e)))) )
>
>
> and after linking all the code in the top-level symbol, I un-intern the labels;
>
>
> ;; symbol is compiled, release all its labels
> ;;
> (loop for e in (sym-labels syminst)
>       for name = (lab-name e)
>       do
>       (eval `(unintern ',name)) ) )

This last EVAL is quite odd: why not just write (unintern name)? But
better yet, don't bother. This whole interning/uninterning business is
not doing what you think it's doing. (Actually, thanks to the UNINTERN
it's sort of doing something like what you want: once you've
uninterned the symbol, you've gotten rid of the actual symbol that was
read by the reader so that when you compile the data linked to
subsequent top-level names, the interning will create new symbols
which will be independent of the old symbols of the same name. But
that also means that over the course of your compiler running you have
many different symbols with the same name, which seems like a recipe
for madness.) I'm still not sure how all the pieces of your system fit
together but I'm pretty convinced that there's a much simpler way to
skin this cat.
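The "many different symbols with the same name" effect is easy to see
in isolation. The function and the default symbol name below are
invented for this illustration:

```lisp
(defun unintern-demo (&optional (name "SCRATCH-LABEL-FOR-DEMO"))
  (let* ((old (intern name))          ; what the reader would have made
         (removed (unintern old))     ; drop it from the package
         (new (intern name)))         ; same name, brand-new symbol
    (unintern new)                    ; leave the package as we found it
    (list removed
          (eq old new)
          (string= (symbol-name old) (symbol-name new)))))
```

(unintern-demo) returns (T NIL T): the UNINTERN succeeded, the two
symbols are distinct objects, and they have the same name. Any value
or property hung on the old symbol is invisible through the new one.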

To provide some food for thought, here's a sketch of part of an
assembler that works somewhat like I think yours ought to. Basically I
define a macro DEFTEXT which can be used to define an association
between a name and a snippet of machine code. The idea (which I'm sort
of guessing at from the code you've shown us) is that such a snippet
is then combined with other snippets into a final assembly. This
sketch doesn't include the ability to include arbitrary Common Lisp
code within a DEFTEXT body but only because I'm not sure how you're
envisioning that being used. At any rate, note how there's no interning
or uninterning of symbols and no calls to EVAL. After a DEFTEXT form
is evaluated you can use the function GETTEXT to get at the machine
code. For instance you can EVAL the following form (or load a file
containing the form or whatever):

(deftext foo
  (nop)
  (goto label2)
  (nop)
  label1 (nop)
  (push 128)
  (push 255)
  (goto label1)
  label2 (nop))

then:

ASM51> (gettext 'foo)
#(0 1 11 0 0 2 128 2 255 1 4 0)
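The source of the sketch itself does not appear in this copy of the
message, so what follows is only a guess at a DEFTEXT and GETTEXT that
would produce the bytes above. The opcode values (NOP = 0, GOTO = 1,
PUSH = 2) and the one-byte operands are inferred from that output; the
names *OPCODES*, *TEXTS*, OPCODE and ASSEMBLE are invented here.

```lisp
(defparameter *opcodes* '((nop . 0) (goto . 1) (push . 2))
  "Assumed opcode table, inferred from the example output.")

(defvar *texts* (make-hash-table :test #'eq)
  "Maps a top-level name to its assembled byte vector.")

(defun opcode (name)
  ;; Compare by name so the table works whatever package NAME was read in.
  (or (cdr (assoc name *opcodes* :test #'string=))
      (error "Unknown instruction ~S" name)))

(defun assemble (body)
  (let ((label-table (make-hash-table :test #'eq))
        (address 0)
        (bytes '()))
    ;; Pass 1: a bare symbol is a label for the current address; an
    ;; instruction occupies one byte plus one byte per operand.
    (dolist (item body)
      (if (symbolp item)
          (setf (gethash item label-table) address)
          (incf address (length item))))
    ;; Pass 2: emit opcodes and operands, resolving label operands.
    (dolist (item body)
      (unless (symbolp item)
        (push (opcode (first item)) bytes)
        (dolist (operand (rest item))
          (push (if (symbolp operand)
                    (or (gethash operand label-table)
                        (error "Undefined label ~S" operand))
                    operand)
                bytes))))
    (coerce (nreverse bytes) 'vector)))

(defmacro deftext (name &body body)
  `(setf (gethash ',name *texts*) (assemble ',body)))

(defun gettext (name)
  (values (gethash name *texts*)))
```

The label table lives only for the duration of one ASSEMBLE call, so
LABEL1 in one DEFTEXT is independent of LABEL1 in another. The symbols
themselves are never interned, uninterned, proclaimed special, or
given values.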

Hope this gives you some ideas.

-Peter

--
Peter Seibel * peter@xxxxxxxxxxxxxxx
Gigamonkeys Consulting * http://www.gigamonkeys.com/
Practical Common Lisp * http://www.gigamonkeys.com/book/