Re: Status of ayacc and aflex?



Ludovic Brenta wrote:
> Niklas Holsti wrote:
> ...
>> I have not used the parsing functions in OpenToken...
>
> I too like the absence of a generator, and I'm used to writing my own
> parsers by hand. Did you find that OpenToken helped a lot in doing
> that?

It does its job: lexical analysis. The parser gets to look at the tokens one by one; OpenToken provides a function to return the identity of the current token (an enumeration), another function to return the text string of the current token, and a procedure to advance to the next token. That is what I expect of a lexical analyser, and OpenToken gives me that (at least; I haven't really studied it thoroughly to see what else there may be).
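To make that three-operation interface concrete, here is a minimal sketch of a hand-written parser sitting on top of such a lexer. It is written in Python rather than Ada, and it does not use OpenToken's actual API: the names Lexer, TokenId, token_id, lexeme, advance and parse_sum are all hypothetical, chosen only to mirror the three operations described above.

```python
import re
from enum import Enum, auto


class TokenId(Enum):
    NUMBER = auto()
    PLUS = auto()
    END = auto()


class Lexer:
    """Hypothetical analyser: current token identity, current text, advance."""

    _rules = [(TokenId.NUMBER, re.compile(r"\d+")),
              (TokenId.PLUS, re.compile(r"\+"))]

    def __init__(self, text):
        self._text = text
        self._pos = 0
        self._id = TokenId.END
        self._lexeme = ""
        self.advance()  # load the first token

    def token_id(self):
        """Identity of the current token (an enumeration value)."""
        return self._id

    def lexeme(self):
        """Text string of the current token."""
        return self._lexeme

    def advance(self):
        """Move on to the next token, skipping white space."""
        while self._pos < len(self._text) and self._text[self._pos].isspace():
            self._pos += 1
        if self._pos >= len(self._text):
            self._id, self._lexeme = TokenId.END, ""
            return
        for token_id, pattern in self._rules:
            match = pattern.match(self._text, self._pos)
            if match:
                self._id, self._lexeme = token_id, match.group()
                self._pos = match.end()
                return
        raise SyntaxError(f"no token matches at offset {self._pos}")


def parse_sum(lexer):
    """Hand-written parser for the toy grammar: sum ::= NUMBER { '+' NUMBER }"""

    def number():
        if lexer.token_id() is not TokenId.NUMBER:
            raise SyntaxError(f"expected a number, found {lexer.lexeme()!r}")
        value = int(lexer.lexeme())
        lexer.advance()
        return value

    total = number()
    while lexer.token_id() is TokenId.PLUS:
        lexer.advance()
        total += number()
    if lexer.token_id() is not TokenId.END:
        raise SyntaxError(f"unexpected text {lexer.lexeme()!r}")
    return total
```

The parser never touches the input text itself; it only asks for the current token's identity and string and then advances, which is the division of labour described above.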

As far as I recall, the only wart I have found has to do with error reporting when the input text has a sequence of characters that does not match any token -- I had to add a special "invalid token" definition (OpenToken.Recognizer.Nothing.Get) to find the line number and column number of the erroneous text. A minor detail.
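The general idea of such a catch-all token can be sketched as follows. This is Python, not Ada, and it only illustrates the technique, not how OpenToken implements it: the token names, the regular expression and the function tokens_with_positions are assumptions made for the example. The last alternative matches any single character that nothing else recognised, so the bad text comes out as an ordinary token carrying its line and column.

```python
import re

# The INVALID alternative comes last, so it only fires when no real
# token definition matches at the current position.
TOKEN_RE = re.compile(r"""
    (?P<IDENT>[A-Za-z_]\w*)
  | (?P<NUMBER>\d+)
  | (?P<NEWLINE>\n)
  | (?P<SKIP>[ \t]+)
  | (?P<INVALID>.)
""", re.VERBOSE)


def tokens_with_positions(text):
    """Yield (kind, text, line, column) tuples; line and column are 1-based."""
    line = 1
    line_start = 0  # offset of the first character of the current line
    for match in TOKEN_RE.finditer(text):
        kind = match.lastgroup
        if kind == "NEWLINE":
            line += 1
            line_start = match.end()
        elif kind != "SKIP":
            yield kind, match.group(), line, match.start() - line_start + 1
```

A parser built on this can report the position of an INVALID token just as it would for any other unexpected token.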

If I try to think of what might be missing, perhaps the main thing is context-dependent lexical analysis: the ability to say, for example, that I expect the next token to be an identifier, so please ignore the definitions of reserved keywords and just consider them identifiers, too. Of course I have designed my own languages not to need this (keywords are really reserved).
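What I mean by context-dependent analysis can be sketched like this. It is Python, and entirely hypothetical: ContextLexer, next_token and the keywords_enabled parameter are names invented for the illustration, and as far as I know OpenToken offers nothing of the kind. The parser tells the lexer, per call, whether the reserved-word definitions should apply.

```python
import re

KEYWORDS = {"begin", "end", "if"}
WORD_RE = re.compile(r"\s*([A-Za-z_]\w*)")


class ContextLexer:
    """Toy lexer whose caller can switch keyword recognition off per token."""

    def __init__(self, text):
        self._text = text
        self._pos = 0

    def next_token(self, keywords_enabled=True):
        match = WORD_RE.match(self._text, self._pos)
        if match is None:
            # The sketch only knows about words; anything else ends the scan.
            return ("END", "")
        self._pos = match.end()
        word = match.group(1)
        if keywords_enabled and word.lower() in KEYWORDS:
            return ("KEYWORD", word)
        # With keywords disabled, a reserved word is just another identifier.
        return ("IDENTIFIER", word)
```

For the input "if end", a parser that expects an identifier after "if" would call next_token(keywords_enabled=False) and receive "end" as an identifier rather than a keyword.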

As far as I know, OpenToken has no general look-ahead facility, but it should not be too hard to build one yourself, on top of OpenToken, if you need it.
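A look-ahead layer of that kind is essentially a queue between the lexer and the parser. Here is a minimal sketch in Python, assuming only that the underlying lexer can be treated as a source that yields tokens one at a time; LookaheadBuffer, peek and advance are hypothetical names, and nothing here is OpenToken code.

```python
from collections import deque


class LookaheadBuffer:
    """Wraps a one-token-at-a-time source and allows peeking k tokens ahead."""

    def __init__(self, tokens, end_marker=None):
        self._source = iter(tokens)
        self._buffer = deque()
        self._end = end_marker  # returned once the source is exhausted

    def peek(self, k=0):
        """Return the token k positions ahead without consuming anything."""
        while len(self._buffer) <= k:
            self._buffer.append(next(self._source, self._end))
        return self._buffer[k]

    def advance(self):
        """Consume and return the current token."""
        token = self.peek(0)
        self._buffer.popleft()
        return token
```

Tokens are pulled from the underlying source only when a peek needs them, so a parser that never looks ahead pays almost nothing for the extra layer.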

Another good point about OpenToken and the absence of a generator phase: one can have several instances of OpenToken in the same application, for different languages, without any clashes of names or data. My application needs that.

A caveat: The amounts of text that my applications scan with OpenToken are small. I have no idea of the scanning speed; it has been quite enough for my needs.

--
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
. @ .
.


