Re: Assembler in Lex/Yacc
- From: "randyhyde@xxxxxxxxxxxxx" <spamtrap@xxxxxxxxxx>
- Date: 3 Dec 2005 10:05:54 -0800
Sanky wrote:
> Hi there,
>
> I was working on an assembler for X64 architecture. I was wondering
> what are the tradeoffs in designing an assembler using lex or re2c and
> yacc?
Been there, done that.
HLA v1.x, as a prototype, was written with Flex and Bison. Biggest
mistake I've ever made in my software engineering lifetime. Like you,
it seemed reasonable to me when I started the project back in 1996. By
1998, nearly 50,000 lines of Bison code later, I'd discovered that I
was saving absolutely *nothing* by using Flex/Bison and the costs were
very high.
> Why is it that handwritten assemblers are more popular than those
> developed using lex/yacc?
Largely because most people who write assemblers don't know lex/yacc
:-)
But nevertheless, Bison/Yacc and Flex/Lex have some substantial
liabilities that get in the way of using them to write a decent
assembler:
1) Bison/Yacc is *infamous* for having poor error-recovery
capabilities. If you want your assembler to produce meaningful error
messages (as I do for HLA, which is intended for beginners), this is a
sufficient roadblock in and of itself.
2) Flex/Bison (Lex/Yacc) make it very difficult to write a
"compile-time" language (i.e., macros, conditional assembly, and stuff
like that). Not impossible, but difficult. Though HLA has, perhaps, the
most powerful compile-time language around, it was difficult to
implement, a total kludge, and is *very* low performance. Indeed, most
of the complaints people have about HLA's compile speed are directly
attributable to the processing of compile-time language statements.
3) Flex/Bison (Lex/Yacc) are great for *small* compilers. But as
neither tool provides the ability for separate compilation, you can
wind up with some *really* large source files if your language is
large. At one time, HLA v1.x's Bison file was in excess of 100,000
lines of code. This is *far* too big for a single source file. (I've
since shrunk it to about 80,000 lines of code by moving a lot of C
code into separate modules, but the file is still too large by an
order of magnitude.)
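To make points 1) and 2) concrete, here is a minimal Python sketch of how a hand-written, line-oriented front end can fold conditional assembly into its main loop and keep going after an error by resynchronizing at the next line. This is not HLA's code or syntax: the directive names (.define, .if, .else, .endif), the toy mnemonic set, and the function name assemble_lines are all made up for illustration.

```python
# Hypothetical directives and a toy mnemonic set; none of this is HLA syntax.
KNOWN_MNEMONICS = {"mov", "add", "ret"}


def assemble_lines(source):
    """Return (kept_statements, errors) for a toy line-oriented assembler."""
    symbols = set()   # names introduced by .define
    active = [True]   # stack: is the current conditional region assembled?
    kept, errors = [], []
    for lineno, raw in enumerate(source.splitlines(), start=1):
        line = raw.split(";", 1)[0].strip()   # drop ';' comments
        if not line:
            continue
        words = line.split()
        head = words[0].lower()
        if head == ".if":
            ok = len(words) == 2
            if not ok:
                errors.append(f"line {lineno}: .if expects exactly one symbol name")
            # Always push, so the matching .endif still balances.
            active.append(ok and active[-1] and words[1] in symbols)
        elif head == ".else":
            if len(active) == 1:
                errors.append(f"line {lineno}: .else without matching .if")
            else:
                active[-1] = active[-2] and not active[-1]
        elif head == ".endif":
            if len(active) == 1:
                errors.append(f"line {lineno}: .endif without matching .if")
            else:
                active.pop()
        elif not active[-1]:
            continue   # inside a region that is switched off
        elif head == ".define":
            if len(words) != 2:
                errors.append(f"line {lineno}: .define expects exactly one symbol name")
            else:
                symbols.add(words[1])
        elif head not in KNOWN_MNEMONICS:
            # Report the problem and carry on with the next line.
            errors.append(f"line {lineno}: unknown mnemonic '{words[0]}'")
        else:
            kept.append((lineno, line))
    if len(active) > 1:
        errors.append("end of file: missing .endif")
    return kept, errors


if __name__ == "__main__":
    demo = ".define DEBUG\n.if DEBUG\nmov rax, 1\n.else\nadd rax, 2\n.endif\nmvo rbx, 3\nret\n.endif"
    statements, problems = assemble_lines(demo)
    for lineno, text in statements:
        print(lineno, text)
    for message in problems:
        print(message)
```

Because every error is tied to one line and the loop simply moves on, a single bad statement does not hide the ones after it. The sketch does not reject a repeated .else, which a real assembler would.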
> Developing a grammar is the toughest part,
Not really. The grammar for a *typical* assembler is usually quite
simple. One of the reasons I got trapped into using Flex/Bison is that
HLA has a very complex grammar (more so than many HLLs like Pascal). I
figured that Flex/Bison would be a good choice because of this. I was
wrong.
> but once you have a grammar ready, I think the rest of the routines are
> pretty straightforward?
Once you have a grammar, it's not that hard to write a
recursive-descent predictive parser for the grammar. Particularly if
it's a relatively straightforward grammar.
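As a rough illustration of what such a parser looks like, here is a small Python sketch of a recursive-descent predictive parser for one assembler statement. The grammar is an assumption chosen for the example, not the grammar of HLA or any real assembler: statement := [label ':'] mnemonic [operand {',' operand}], and operand := number | name | '[' name ['+' number] ']'. The names tokenize, Parser, and parse_line are hypothetical.

```python
import re

# One token per match: a number, an identifier, or a punctuation character.
TOKEN_RE = re.compile(r"(?P<num>\d+)|(?P<ident>[A-Za-z_]\w*)|(?P<punct>[:,\[\]+])")


def tokenize(text):
    tokens, pos = [], 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        m = TOKEN_RE.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected character {text[pos]!r} at column {pos + 1}")
        tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        if self.pos < len(self.tokens):
            return self.tokens[self.pos]
        return ("eof", "")

    def accept(self, kind, value=None):
        """Consume and return the next token's text if it matches, else None."""
        tok = self.peek()
        if tok[0] == kind and (value is None or tok[1] == value):
            self.pos += 1
            return tok[1]
        return None

    def expect(self, kind, value=None):
        got = self.accept(kind, value)
        if got is None:
            found = self.peek()[1] or "end of line"
            raise SyntaxError(f"expected {value or kind!r}, found {found!r}")
        return got

    def statement(self):
        label = None
        first = self.expect("ident")
        if self.accept("punct", ":") is not None:
            label, mnemonic = first, self.expect("ident")
        else:
            mnemonic = first
        operands = []
        if self.peek()[0] != "eof":
            operands.append(self.operand())
            while self.accept("punct", ",") is not None:
                operands.append(self.operand())
        if self.peek()[0] != "eof":
            raise SyntaxError(f"unexpected {self.peek()[1]!r} after operands")
        return {"label": label, "mnemonic": mnemonic, "operands": operands}

    def operand(self):
        # Predictive: the next token alone decides which alternative applies.
        num = self.accept("num")
        if num is not None:
            return ("imm", int(num))
        name = self.accept("ident")
        if name is not None:
            return ("name", name)
        if self.accept("punct", "[") is not None:
            base = self.expect("ident")
            disp = 0
            if self.accept("punct", "+") is not None:
                disp = int(self.expect("num"))
            self.expect("punct", "]")
            return ("mem", base, disp)
        found = self.peek()[1] or "end of line"
        raise SyntaxError(f"expected an operand, found {found!r}")


def parse_line(text):
    return Parser(tokenize(text)).statement()
```

Each grammar rule maps to one method, and every failure point can say exactly what was expected and what was found, which is where the error-message advantage over a table-driven parser comes from.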
> Is it that code generated by Lex/Yacc is not as
> efficient?
Actually, the code it generates isn't that bad at all. It's *huge*
(because of all the tables). The only performance problems I've seen
are due to the fact that my particular parser and lexer tables are so
large that they blow the cache away and you take a big performance hit
for that. HLA, however, is a *large* language; YMMV for a simpler
assembler.
> YASM does get close, but again it resorts to hand written code at some
> places.
Understandable. I had to put a lot of hand-written code for the
compile-time language into HLA.
>
> Is it that Lex/Yacc is an overkill or is that Lex/Yacc are not the
> right tools?
For a production-quality assembler, lex/yacc (Flex/Bison) are not the
right tools.
My feelings about using Flex/Bison for HLA are really mixed. On the one
hand, they are completely inappropriate for this type of tool. OTOH,
for a *prototype* (which is what HLA v1.x is), Flex and Bison have been
useful for making some quick changes to things along the way. But like
any prototype, the code needs to be rewritten for the final,
production, version (which is exactly what is happening with HLA v2.0).
Were I to do it all over again, I don't know which way I'd go. I'm
pretty sure I wouldn't use Bison. If I really wanted to use a
compiler-compiler again, I'd probably go with ANTLR. But the main
reason for doing that would be to force myself to learn another tool
(you'd not believe how many intricate things about Bison and Flex I
learned by writing HLA v1.x). OTOH, if producing an assembler was my
main goal, rather than learning a new tool, I think I'd just go with
C/C++ or even assembly language.
> I'm new to compiler and assembler design, and would be
> helpful if anyone can guide.
http://webster.cs.ucr.edu/AsmTools/RollYourOwn/index.html
Cheers,
Randy Hyde