Re: XML Not good for Big Files (vs Flat Files)



Oliver Wong wrote:

Is this even possible? Wouldn't the escaping mechanism depend on what
the punctuations of the file format are?

I don't see why not. There are several broad categories of encoding[*]
techniques.

([*] don't take the word "encoding" to imply that the format is not normally
readable.)

One category simply requires that the encoding is self-delimiting, so that
/any/ text inside it is interpreted purely according to the rules of the
encoding. The syntax of the surrounding context is then irrelevant. E.g. a
length prefix, or a strong quoting convention like the 'xyz' strings in the
Unix Bourne shell and its derivatives.
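
A minimal sketch of the self-delimiting idea, in Python. The function names
and the "<length>:<text>" layout are assumptions chosen for illustration, not
any particular standard; the shell-style quoting uses the standard library's
shlex module.

```python
import shlex
from typing import Tuple


def length_prefix(text: str) -> str:
    """Frame text as '<length>:<text>' (hypothetical layout; length in characters)."""
    return f"{len(text)}:{text}"


def read_length_prefixed(data: str, pos: int = 0) -> Tuple[str, int]:
    """Read one framed item starting at pos; return (text, position after it)."""
    colon = data.index(":", pos)      # the first colon ends the length field
    length = int(data[pos:colon])
    start = colon + 1
    return data[start:start + length], start + length


# The payload may contain anything, including the surrounding format's own
# punctuation, because the reader never inspects it.
payload = 'a,b\n<c>&"d":e'
framed = length_prefix(payload) + "...rest of the enclosing file"
text, end = read_length_prefixed(framed)

# Strong quoting in the Bourne shell style: shlex.quote wraps the text so
# that shlex.split recovers it unchanged.
quoted = shlex.quote("it's <b>")
recovered = shlex.split(quoted)[0]
```

Neither reader needs to know what the enclosing format treats as special,
which is the point of this category.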

Another possibility is similar, but the encoding is parameterised. For instance
a C-like escape mechanism could be parameterised on:
    the Start character (defaults to ")
    the End character (defaults to Start)
    the Escape character (defaults to \)
    the range of characters that need to be escaped (defaults to End and
    Escape itself).
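
A rough sketch of such a parameterised escape mechanism, in Python. The
function and parameter names are made up for illustration, and it assumes
that Start, End and Escape are each a single character.

```python
from typing import Optional, Set, Tuple


def escape(text: str, start: str = '"', end: Optional[str] = None,
           esc: str = "\\", must_escape: Optional[Set[str]] = None) -> str:
    """Wrap text in start/end, prefixing each character in must_escape with esc."""
    end = start if end is None else end
    if must_escape is None:
        must_escape = {end, esc}      # the minimum needed for unambiguous reading
    out = [start]
    for ch in text:
        if ch in must_escape:
            out.append(esc)
        out.append(ch)
    out.append(end)
    return "".join(out)


def unescape(quoted: str, start: str = '"', end: Optional[str] = None,
             esc: str = "\\") -> Tuple[str, int]:
    """Decode one quoted item; return (text, index just past the End character)."""
    end = start if end is None else end
    if not quoted.startswith(start):
        raise ValueError("missing start character")
    out = []
    i = len(start)
    while i < len(quoted):
        ch = quoted[i]
        if ch == esc:                 # take the next character literally
            i += 1
            if i >= len(quoted):
                raise ValueError("dangling escape character")
            out.append(quoted[i])
        elif ch == end:               # an unescaped End closes the item
            return "".join(out), i + 1
        else:
            out.append(ch)
        i += 1
    raise ValueError("missing end character")


default_form = escape('a"b')                            # C-like defaults
custom_form = escape("a>b%", start="<", end=">", esc="%")
```

The same pair of functions serves any host format whose punctuation can be
described by those four parameters.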

Another set of possibilities is like URL-encoding or the numeric character
references in XML/HTML (I may have the name wrong; I mean things like &#2345;
but not &amp;). In this case the mechanism is necessarily parameterised on the
surrounding format, since that determines what /has/ to be escaped.
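
As an illustration of that last category, here is a small Python sketch in
which the set of reserved characters is supplied by the surrounding format.
The names encode_ncr and decode_ncr are invented for this example; the
&#NNN; notation is the decimal numeric form used by XML/HTML.

```python
import re


def encode_ncr(text: str, reserved: str) -> str:
    """Replace each reserved character (and '&' itself) with a decimal &#NNN; form."""
    must_escape = set(reserved) | {"&"}   # '&' introduces a reference, so it is always escaped
    return "".join(f"&#{ord(ch)};" if ch in must_escape else ch for ch in text)


_NCR = re.compile(r"&#([0-9]+);")


def decode_ncr(text: str) -> str:
    """Turn every &#NNN; back into the character with that code point."""
    return _NCR.sub(lambda m: chr(int(m.group(1))), text)


# A comma-separated flat file reserves ',' and newline; an XML-like host
# would pass '<' and '>' instead. The encoder itself does not change.
csv_field = encode_ncr("a,b&c", ",\n")
xml_text = encode_ncr("1 < 2", "<>")
```

Decoding needs no parameters at all: only the encoder has to know what the
host format reserves.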

And so on. My point is that it /could/ have been done (a "best practice" RFC
perhaps). Sad that it was not...

-- chris

