Re: A Proper Lexer
- From: "Chun Tian (binghe)" <binghe.lisp@xxxxxxxxx>
- Date: Sun, 14 Sep 2008 00:57:19 -0700 (PDT)
I second this. In most case there's no need for regular expression
when doing lexer job.
Months ago I wrote an ASN.1->Lisp compiler [1], the lexer part is
whole implemented by CL readtable. ASN.1 token is quite complicated
but CL readtable is powerful enough.
--binghe
[1] http://sourceforge.net/project/showfiles.php?group_id=209101&package_id=285435
Paul Tarvydas wrote:
akopa wrote:.
Does any one have any advice on doing fast-ish character I/O for a Lex...
like scanner. I've read http://www.emmett.ca/~sabetts/slurp.html,
but I want to present single chars to the scanner.
I was a little surprised to find no equivalent of Lex available for
Common Lisp. There is http://www.geocities.com/mparker762/clawk but
it is string oriented, not stream oriented.
There is little incentive to use Lex in lisp, because the lisp reader *is* a
lexer (tokenizer), hence, lisp is the language least likely to have a
Lex-like library. I've built many small languages and just used what lisp
gives me, e.g. if you can design it so that the little language has a space
between each token, you don't have to write a lexer. I suppose you could
get fancier by employing read macros, et al.
When I write languages that cannot follow the above simplification, I use my
home-grown CL versions of extended S/SL (syntax/semantic language) and my
experimental packrat parser (which describes the construction of tokens and
parses - google for "Bryan Ford Packrat Parser"). I can share these with
you, but they are not Lex-like (nor are they well-documented, but there is
always email :-).
By definition, lexing something a character at a time is, by definition, a
state machine. If you design it with a state machine macro - code executed
on entry, transition and exit - you will not tie yourself up in knots. I
have a state machine macro which I could share.
Lexing is often a simple enough job that you can just roll it by hand using
a few simple primitives (I think I can list them, if you're interested).
Again, the state machine paradigm is the way to go.
As for speed, what are you doing? I've stopped worrying about speed in
lexing and parsing - desktop machines are so fast these days that I seem to
get away with fully-backtracking parsers and a packrat parser with no
packratting and don't even notice.
And if you REALLY want Lex, then I know from direct experience that you can
code one up directly from the algorithms in the Dragon book (I did it about
30 years ago, but my version is lost, or on an 8.5" floppy :-).
It ain't exactly what you asked for, but contact me if any of the above is
of interest.
pt
- References:
- A Proper Lexer
- From: akopa
- Re: A Proper Lexer
- From: Paul Tarvydas
- A Proper Lexer
- Prev by Date: Re: accumulating with loop
- Next by Date: Re: accumulating with loop
- Previous by thread: Re: A Proper Lexer
- Next by thread: Help with packages
- Index(es):
Relevant Pages
|