Re: Code review comment: "Noone uses tokens anymore"



Kevin J. Phillips wrote:
> Ertugrul Soeylemez wrote:
> > As already said, I'm also pretty sure that he means some well-used
> > configuration file format. He might mean that you shouldn't
> > hand-tokenize, but instead use something like XML or a scripting
> > language. I for myself prefer to use TCL for configuration files and
> > XML or binary for data files, whichever is more appropriate.
>
> Yep, you guys were right.
>
> OK, I had a sit down with him to go over some of his comments. When I
> asked him to clarify his statement, he said that tokenizing and parsing
> was an overly complex solution to the problem of reading the data file.
> I should either use a combination of fgets() and sscanf(), or switch to
> XML.

Indeed, sscanf() doesn't have any parsing capabilities that are useful
in the real world. And fgets() is not a solution, in and of itself, to
the problem of consuming input (see
http://www.pobox.com/~qed/userInput.html).
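
To make the fgets() point concrete: a single call silently splits any
line longer than the buffer, so the caller has to loop and grow the
buffer. Below is a minimal sketch of that loop. The name read_line()
and the starting size of 128 are assumptions made for illustration;
they come neither from the linked page nor from the code under review.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: read one whole line of any length from fp.
 * Returns a malloc'd string without the trailing newline, or NULL at
 * end of file or on allocation failure. The caller frees the result. */
char *read_line(FILE *fp)
{
    size_t cap = 128;   /* assumed starting size, grown on demand */
    size_t len = 0;
    char *buf = malloc(cap);

    if (buf == NULL)
        return NULL;

    while (fgets(buf + len, (int)(cap - len), fp) != NULL) {
        len += strlen(buf + len);
        if (len > 0 && buf[len - 1] == '\n') {
            buf[len - 1] = '\0';        /* complete line: drop newline */
            return buf;
        }
        if (cap - len < 2) {            /* buffer full, line continues */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
    }

    if (len > 0)
        return buf;     /* last line of the file had no newline */
    free(buf);
    return NULL;
}
```

This sketch does not handle embedded NUL bytes, since it relies on
strlen() to find out how much fgets() stored.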

> I strongly disagree with his first comment. There's no way fgets() and
> sscanf() could properly handle some of the combinations that are
> possible. I don't think my solution was very complex. It was only
> three functions: one to tokenize, one to parse, and one to save the
> result in memory.
>
> The second part of his comment, to consider using XML, is worth looking
> at. I am mulling it over...

I am told that completely correct XML parsing is horrendously
difficult. But if your format is just some tiny subset of XML, I'm
sure that could work. It puts an extra burden on the configuration
file authors though.

> BTW, the configuration file is pretty simple. It's just a file with
> lines like below:
>
> ; Comment lines begin with a semi-colon
> var = 3.141
> var2 = plugh
> var3 = "message ; hello" ; Inline comment
> var4 = 'When I saw him, I said "Hello!"'

OK, then your reviewer doesn't understand the problem. You need to find
some way of handling the whole quote-in-quotes thing, and no matter
what, you are going to do some sort of tokenization to deal with this.
If you did this explicitly as tokenization, then parsing, then that's
fine.
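
For illustration, here is a minimal sketch of a hand-written scanner
for one line of the format quoted above. It is not the poster's actual
code. The function name parse_config_line(), the return codes, and the
rules it enforces (identifiers made of letters, digits and '_', no
escape sequences inside quotes, unquoted values ending at whitespace or
';') are all assumptions, since the post does not spell out the
grammar.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

static const char *skip_blanks(const char *p)
{
    while (*p == ' ' || *p == '\t')
        p++;
    return p;
}

/* Hypothetical helper: split "name = value ; comment" into name and value.
 * Returns 1 for an assignment, 0 for a blank or comment-only line,
 * and -1 on a syntax error or if an output buffer is too small. */
int parse_config_line(const char *line, char *name, size_t name_sz,
                      char *value, size_t value_sz)
{
    const char *p = skip_blanks(line);
    size_t n = 0;

    if (name_sz == 0 || value_sz == 0)
        return -1;
    if (*p == '\0' || *p == ';' || *p == '\n' || *p == '\r')
        return 0;

    /* Token 1: the variable name. */
    while (isalnum((unsigned char)*p) || *p == '_') {
        if (n + 1 >= name_sz)
            return -1;
        name[n++] = *p++;
    }
    if (n == 0)
        return -1;
    name[n] = '\0';

    /* Token 2: the '=' sign. */
    p = skip_blanks(p);
    if (*p != '=')
        return -1;
    p = skip_blanks(p + 1);

    /* Token 3: the value. Inside quotes, ';' and the other kind of
     * quote are ordinary characters; only the opening quote closes it. */
    n = 0;
    if (*p == '"' || *p == '\'') {
        char quote = *p++;
        while (*p != quote) {
            if (*p == '\0' || *p == '\n' || n + 1 >= value_sz)
                return -1;      /* unterminated string or overflow */
            value[n++] = *p++;
        }
        p++;                    /* skip the closing quote */
    } else {
        while (*p != '\0' && *p != ';' && !isspace((unsigned char)*p)) {
            if (n + 1 >= value_sz)
                return -1;
            value[n++] = *p++;
        }
        if (n == 0)
            return -1;
    }
    value[n] = '\0';

    /* Only blanks or a ';' comment may follow the value. */
    p = skip_blanks(p);
    if (*p != '\0' && *p != ';' && *p != '\n' && *p != '\r')
        return -1;
    return 1;
}
```

Called on the var3 line from the sample file, this stores var3 and
message ; hello and returns 1; the comment-only first line returns 0.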

Perhaps the reviewer doesn't like the inevitable complexity of such
code, but there's little you can do about it, short of using yacc or
bison or whatever (in which case you are just relying on someone else's
complex code, rather than your own.)

The value of XML would be that you can specify the types of your data.
Is that what the reviewer is really worried about? For example, do
you actually need to determine that var is a float, var2 is some sort of
symbol, and var3 and var4 are strings, as the output of your parser?
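
If typed output does turn out to be needed, it can be layered on top of
the tokens without XML. The following is a minimal sketch, assuming the
tokenizer reports whether a value was quoted; the enum, the function
name classify_value() and the was_quoted flag are hypothetical names
introduced only for this example.

```c
#include <stdlib.h>

enum value_kind { VALUE_NUMBER, VALUE_SYMBOL, VALUE_STRING };

/* Hypothetical helper: decide what kind of value a token is.
 * was_quoted is assumed to come from the tokenizer. On VALUE_NUMBER
 * the converted value is stored in *number when number is not NULL. */
enum value_kind classify_value(const char *text, int was_quoted,
                               double *number)
{
    char *end;
    double d;

    if (was_quoted)
        return VALUE_STRING;

    /* strtod() reports where it stopped; the token counts as a number
     * only if the whole token was consumed. Note that strtod() also
     * accepts forms such as "inf", "nan" and hexadecimal floats. */
    d = strtod(text, &end);
    if (end != text && *end == '\0') {
        if (number != NULL)
            *number = d;
        return VALUE_NUMBER;
    }
    return VALUE_SYMBOL;
}
```

With the sample file, 3.141 would classify as a number, plugh as a
symbol, and the two quoted values as strings.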

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/
