Re: Code review comment: "Noone uses tokens anymore"
- From: websnarf@xxxxxxxxx
- Date: 6 Nov 2005 10:45:30 -0800
Kevin J. Phillips wrote:
> Ertugrul Soeylemez wrote:
> > As already said, I'm also pretty sure that he means some well-used
> > configuration file format. He might mean that you shouldn't
> > hand-tokenize, but instead use something like XML or a scripting
> > language. I for myself prefer to use TCL for configuration files and
> > XML or binary for data files, whichever is more appropriate.
>
> Yep, you guys were right.
>
> OK, I had a sit down with him to go over some of his comments. When I
> asked him to clarify his statement, he said that tokenizing and parsing
> was an overly complex solution to the problem of reading the data file.
> I should either use a combination of fgets() and sscanf(), or switch to
> XML.
Indeed, sscanf() doesn't have any parsing capabilities that are useful
in the real world. And fgets() is not a solution, in and of itself, to the
problem of consuming input (see
http://www.pobox.com/~qed/userInput.html).
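
One common way a bare fgets() call falls short is that a line longer
than the buffer comes back in pieces, and the caller has to notice the
missing newline and stitch the pieces together. A minimal sketch of
that, assuming a hypothetical helper named read_line() (not from your
code or from any library) and ignoring embedded '\0' characters and
read errors in the middle of a line:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: read one whole line of any length.
 * Returns a malloc'd string without the trailing newline,
 * or NULL at end of file or on allocation failure.
 * The caller must free() the result. */
static char *read_line(FILE *fp)
{
    size_t cap = 64, len = 0;
    char *buf = malloc(cap);

    if (buf == NULL)
        return NULL;

    while (fgets(buf + len, (int)(cap - len), fp) != NULL) {
        len += strlen(buf + len);

        /* A trailing newline means the line is complete. */
        if (len > 0 && buf[len - 1] == '\n') {
            buf[--len] = '\0';
            return buf;
        }

        /* No newline yet: if the buffer is full, grow it and keep reading. */
        if (cap - len <= 1) {
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
    }

    /* fgets() returned NULL: either nothing was read at all,
     * or the last line of the file had no newline. */
    if (len == 0) {
        free(buf);
        return NULL;
    }
    return buf;
}
```

The point is only that the buffer management has to live somewhere;
fgets() by itself does not do it for you.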
> I strongly disagree with his first comment. There's no way fgets() and
> sscanf() could properly handle some of the combinations that are
> possible. I don't think my solution was very complex. It was only
> three functions: one to tokenize, one to parse, and one to save the
> result in memory.
>
> The second part of his comment, to consider using XML, is worth looking
> at. I am mulling it over...
I am told that completely correct XML parsing is horrendously
difficult. But if your format is just some tiny subset of XML, I'm
sure that could work. It puts an extra burden on the configuration
file authors though.
> BTW, the configuration file is pretty simple. It's just a file with
> lines like below:
>
> ; Comment lines begin with a semi-colon
> var = 3.141
> var2 = plugh
> var3 = "message ; hello" ; Inline comment
> var4 = 'When I saw him, I said "Hello!"'
Ok, then your reviewer doesn't understand the problem. You need to find
some way of handling quotes within quotes, and no matter what, you are
going to do some sort of tokenization to deal with this. If you did
this explicitly as tokenization, then parsing, then that's fine.
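
To make that concrete, here is a rough sketch of a hand-written scanner
for one line of the format you posted. The name cfg_parse_line() and
the CFG_* codes are made up for illustration, and I am assuming things
your post does not spell out: a quoted value ends at the next matching
quote character, there are no escape sequences, and anything after the
value other than whitespace or a ';' comment is an error.

```c
#include <ctype.h>
#include <stddef.h>

enum { CFG_ERROR = -1, CFG_EMPTY = 0, CFG_ASSIGN = 1 };

static const char *skip_ws(const char *p)
{
    while (*p == ' ' || *p == '\t')
        p++;
    return p;
}

/* Hypothetical helper: split "name = value ; comment" into name and value.
 * Returns CFG_ASSIGN, CFG_EMPTY (blank or comment-only line) or CFG_ERROR. */
static int cfg_parse_line(const char *line, char *name, size_t nsz,
                          char *value, size_t vsz)
{
    const char *p = skip_ws(line);
    size_t n = 0;

    if (nsz == 0 || vsz == 0)
        return CFG_ERROR;
    if (*p == '\0' || *p == '\n' || *p == '\r' || *p == ';')
        return CFG_EMPTY;

    /* Token 1: the name, up to whitespace, '=' or ';'. */
    while (*p && *p != '=' && *p != ';' && !isspace((unsigned char)*p)) {
        if (n + 1 >= nsz)
            return CFG_ERROR;
        name[n++] = *p++;
    }
    name[n] = '\0';
    if (n == 0)
        return CFG_ERROR;

    /* Token 2: the '=' sign. */
    p = skip_ws(p);
    if (*p != '=')
        return CFG_ERROR;
    p = skip_ws(p + 1);

    /* Token 3: the value, either quoted or a bare word. */
    n = 0;
    if (*p == '"' || *p == '\'') {
        char q = *p++;              /* remember which quote opened it */
        while (*p && *p != q && *p != '\n') {
            if (n + 1 >= vsz)
                return CFG_ERROR;
            value[n++] = *p++;      /* ';' and the other quote are literal here */
        }
        if (*p != q)
            return CFG_ERROR;       /* unterminated quoted string */
        p++;
    } else {
        while (*p && *p != ';' && !isspace((unsigned char)*p)) {
            if (n + 1 >= vsz)
                return CFG_ERROR;
            value[n++] = *p++;
        }
    }
    value[n] = '\0';

    /* Only whitespace or a ';' comment may follow the value. */
    p = skip_ws(p);
    if (*p == '\0' || *p == '\n' || *p == '\r' || *p == ';')
        return CFG_ASSIGN;
    return CFG_ERROR;
}
```

Even at this size, the quoted branch is doing work that a single
sscanf() format string cannot express: it has to remember which quote
character opened the string and treat ';' differently inside it.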
Perhaps the reviewer doesn't like the inevitable complexity of such
code, but there's little you can do about it, short of using yacc or
bison or whatever (in which case you are just relying on someone else's
complex code, rather than your own.)
The value of XML would be that you can specify the types of your data.
Is that what the reviewer is really worried about? For example, do
you actually need to determine that var is a float, var2 is some sort of
symbol, and var3 and var4 are strings as the output of your parser?
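
If you do need that, it does not take XML to get it. A sketch, assuming
your tokenizer records whether a value was quoted; cfg_classify() and
the enum names are hypothetical. Note that strtod() also accepts forms
such as "inf", "nan" and hexadecimal floats, so tighten the check if
those should count as symbols.

```c
#include <errno.h>
#include <stdlib.h>

enum cfg_type { CFG_STRING, CFG_NUMBER, CFG_SYMBOL };

/* Hypothetical helper: decide what kind of value a token is.
 * Quoted values are always strings; unquoted values are numbers
 * if strtod() consumes the whole token, otherwise symbols. */
static enum cfg_type cfg_classify(const char *value, int was_quoted,
                                  double *num)
{
    char *end;
    double d;

    if (was_quoted)
        return CFG_STRING;

    errno = 0;
    d = strtod(value, &end);
    if (end != value && *end == '\0' && errno != ERANGE) {
        *num = d;               /* only written for CFG_NUMBER */
        return CFG_NUMBER;
    }
    return CFG_SYMBOL;
}
```

With your sample file, an unquoted 3.141 would come out as a number,
plugh as a symbol, and both quoted values as strings.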
--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/