Re: extract config file's tokens: need advice on algorithm



Roman Mashak wrote:
> Hello, All!
>
> I have to parse 'named' config file (DNS server). For those who don't know
> of it, its structure looks like this:
>
> # comment
> <statement> {
> <option>;
> <options>;
> ...
> };
>
> There are a number of statements, but I'm interested in one only:
>
> zone "name_of_zone" {
> <zone_options>;
> ...
> };
>
> So what I want is some pretty fast algorithm (right now size is small, but
> it's gonna grow up quickly) running through this file, and return TRUE if
> "name_of_zone" found, FALSE otherwise.

That doesn't look too hard. From what you say the language is
something like

file = {expression || comment}*
comment = # everything-until-end-of-line
expression = statement-part '{' option-list '}' ';'
statement-part = statement argument
option-list = {option ';'}*

There seem to be several types of token, say
LBRACE == {
RBRACE == }
SEMI == ;
STRING == "something"
IDENTIFIER == zone etc

I would firstly make a lexer to read the above tokens either by hand or
with something like flex. I'd kill comments in the lexer. I would
then write a recursive descent parser to read the above grammar, which
means fairly much writing a subroutine corresponding to each BNF line
above. Doing it this way should be very fast.

There are plenty of sources on the web for writing lexer and writing
recursive descent parsers. You may be tempted to do it in an ad-hoc
way, I would avoid that, ad-hoc parsing code is harder to debug than
parsing code written in the normal structured way.

Since 'named' is quite a common file to parse I expect you will be able
to find code to do it on the internet.

.



Relevant Pages

  • Re: Can Coco/R do multiple tokenizations
    ... there are no existing lexer gen tools which allow alternate ... > of trying a particular parse, saving the AST if the parse succeeds, ... > then backtracking, switching lexers and trying the same parse again. ... parse poorly defined languages such as COBOL and PL/1. ...
    (comp.compilers)
  • Re: test for empty stack
    ... actually represents is to parse it, ... I wanted to trap the error of no input ... Given the Subject: test for empty stack ... "I am writing a program which requires one input value. ...
    (comp.sys.hp48)
  • Re: Reading data from port
    ... I suspect the problem may be in your receive and parsing code, ... and parse out of that array. ... ReadByte method instead of the Read method... ...
    (microsoft.public.dotnet.general)
  • weird math
    ... I am writing a program to parse a CSV file downloaded from my bank. ... keep a running balance, but I'm getting a weird total. ...
    (perl.beginners)
  • Re: recent file
    ... >> I'm writing a vb.net console project to parse through the NTBackup log ...
    (microsoft.public.dotnet.languages.vb)