Re: s-expression data language



bob_bane wrote:
As the guy who appears to have launched the "XML is s-expressions in
drag" comment, I'll take this one.

There is no spec for "lisp read syntax". There are detailed specs for
Common Lisp read and Scheme read, and implementations that purport to
implement those specs are compatible with each other.

Unfortunately they're not really completely compatible with each other.
Some CL and some Scheme implementations extend and change the read syntax
in ways that go beyond the standard.

If you restrict
your s-expressions to ASCII atoms, double-quoted strings, integers, and
floating-point numbers, they'll be readable by just about anything.

Yes. I think the approach of having a separate language, like the
s-expression language Antonio mentions below, might solve the problem
better though. It allows for more flexibility than just using the
simple atoms you mention.

I haven't found much use for validation, for s-expression-based
messages or XML documents.

Interesting. Currently I'm writing a program that uses s-exprs to
store and recall data. Since I didn't know about Rivest's s-expressions,
I decided to take the approach you mention of using only simple atoms.
It works OK.
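
To make the "simple atoms only" approach concrete, here is a minimal sketch of a reader for that restricted subset: lists, ASCII atoms, double-quoted strings, integers, and floats. It is written in Python so it does not lean on any one Lisp's reader. The names `read`, `read_atom`, and `Symbol` are made up for illustration, and the backslash-escape rule inside strings is an assumption, not something taken from a spec.

```python
import re

# One token: a paren, a double-quoted string with backslash escapes, or a bare atom.
TOKEN = re.compile(
    r'\s*(?:(?P<paren>[()])|"(?P<string>(?:[^"\\]|\\.)*)"|(?P<atom>[^\s()"]+))',
    re.S,
)
INTEGER = re.compile(r"[+-]?\d+")
FLOAT = re.compile(r"[+-]?(?:\d+\.\d*|\.\d+|\d+)(?:[eE][+-]?\d+)?")


class Symbol(str):
    """A bare atom, kept distinct from a double-quoted string."""


def read_atom(text):
    # Only ASCII atoms are accepted; anything else is rejected outright.
    if not text.isascii():
        raise SyntaxError(f"non-ASCII atom: {text!r}")
    if INTEGER.fullmatch(text):
        return int(text)
    if FLOAT.fullmatch(text):
        return float(text)
    return Symbol(text)


def read(text):
    """Return the list of top-level forms in text as nested Python lists."""
    stack = [[]]  # stack[0] collects the top-level forms
    pos, end = 0, len(text.rstrip())
    while pos < end:
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"unreadable token at offset {pos}")
        pos = m.end()
        if m.group("paren") == "(":
            stack.append([])
        elif m.group("paren") == ")":
            if len(stack) == 1:
                raise SyntaxError("unexpected )")
            finished = stack.pop()
            stack[-1].append(finished)
        elif m.group("string") is not None:
            # Drop the backslash from each escaped character.
            stack[-1].append(re.sub(r"\\(.)", r"\1", m.group("string"), flags=re.S))
        else:
            stack[-1].append(read_atom(m.group("atom")))
    if len(stack) != 1:
        raise SyntaxError("missing )")
    return stack[0]
```

Anything outside the subset (quote characters, vectors, reader macros, and so on) is either read as a plain symbol or rejected, which is the point: the file stays readable by just about any implementation.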

I found that I wanted some level of verification though in order to
give sensible error messages at sensible times. The program reads the
s-exprs when a user opens a file and does simple validation. Then
later after more user input it walks the lists extracting data and
processing it.

If it didn't do simple validation I would have to issue errors about
malformed files when they were processed, which would be confusing. It
would also complicate the processing code with more error reporting
code.
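
A rough sketch of that split, in Python and with a made-up record shape: the file is assumed to have already been read into nested lists, and every record is assumed to look like `(item "name" quantity)`. Neither the record format nor the function names come from the actual program.

```python
def validate(forms):
    """Return a list of readable problems; an empty list means the file is usable."""
    errors = []
    for index, form in enumerate(forms, start=1):
        where = f"form {index}"
        if not isinstance(form, list) or not form:
            errors.append(f"{where}: expected a non-empty list")
            continue
        tag, *fields = form
        if tag != "item":
            errors.append(f"{where}: unknown record type {tag!r}")
        elif len(fields) != 2:
            errors.append(f"{where}: (item ...) takes a name and a quantity")
        elif not isinstance(fields[0], str) or not isinstance(fields[1], int):
            errors.append(f"{where}: name must be a string and quantity an integer")
    return errors


def total_quantity(forms):
    # Processing code: no error reporting here, it relies on validate() having passed.
    return sum(quantity for _tag, _name, quantity in forms)
```

The loader calls `validate` when the file is opened and shows every problem at once; code like `total_quantity`, which runs later, can then walk the lists without its own error handling.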

XML validation systems are somewhat useful
as documentation, but they tend to enforce non-useful constraints (it's
hard to say in a DTD that I want these 5 elements, in any order - you
end up saying they must come in one specific order) while not letting
you check simple-sounding but important constraints (like each of these
elements must have a corresponding element somewhere else). You end up
changing trivial stuff to make the validator happy, then feeding the
result to your program and having to fix other stuff anyway.

I'm sure that's true. But there's no reason why a validation system
should have the limitations of DTDs. A much better validation system
could be made. It certainly wouldn't be difficult at all to make a
validation system a *little* better.
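
As a hedged illustration of "a little better", the two constraints from the quoted paragraph (these elements, in any order; each of these must have a counterpart elsewhere) each fit in a few lines once the data is nested lists. This is a Python sketch with hypothetical names, assuming every child form is a list with its tag first and a name or value second. It is not a proposal for a real schema language.

```python
from collections import Counter


def check_unordered(form, required):
    """Children of form must be exactly the required tags, once each, in any order."""
    tags = Counter(child[0] for child in form[1:])
    errors = [f"missing ({tag} ...)" for tag in required if tags[tag] == 0]
    errors += [f"duplicate ({tag} ...)" for tag in required if tags[tag] > 1]
    errors += [f"unexpected ({tag} ...)" for tag in tags if tag not in required]
    return errors


def check_references(forms, use_tag, def_tag):
    """Every (use_tag name) needs a matching (def_tag name ...) somewhere in forms."""
    defined = {form[1] for form in forms if form[0] == def_tag}
    return [
        f"({use_tag} {form[1]}) has no matching ({def_tag} {form[1]} ...)"
        for form in forms
        if form[0] == use_tag and form[1] not in defined
    ]
```

For example, `check_unordered(["person", ["age", 30], ["name", "Ann"]], ["name", "age"])` reports no errors regardless of the order in which the children appear.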

XML does do some things s-expressions can't do well, most notably
handling extended and multiple character sets well. Individual
implementations of CL/Scheme handle those, but there isn't overarching
spec support for much outside ASCII.

This is a problem. Maybe it would be easier to solve by using a
format that is separate from the Lisp/Scheme languages' own s-exprs,
like the s-expression format?

I think a lot of Lisp/Scheme implementations could support Unicode in
their readers and printers if they wanted to. The problem they have
is that doing so implies the need to support it everywhere else too,
which is harder.

If there were a spec for
character set support, it wouldn't be hard to define an s-expression
document format - one s-expression up front restricted to ASCII that
was the moral equivalent of <?XML ... ?> followed by a payload
s-expression.

That's a good idea. In the program I'm currently writing, each file
begins with:
(version 1)
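
A minimal sketch of checking such a header before touching the payload, in Python. It assumes the header is exactly `(version N)` in ASCII at the very start of the file; the function name and the set of supported versions are invented for the example.

```python
import re

# Bytes pattern: \s and \d match ASCII only, so the header itself must be ASCII.
HEADER = re.compile(rb"\s*\(version\s+(\d+)\)")
SUPPORTED_VERSIONS = {1}  # hypothetical: the versions this reader understands


def split_header(data):
    """Return (version, payload_bytes), or raise ValueError for a bad header."""
    m = HEADER.match(data)
    if m is None:
        raise ValueError("file does not start with (version N)")
    version = int(m.group(1))
    if version not in SUPPORTED_VERSIONS:
        raise ValueError(f"unsupported file version {version}")
    return version, data[m.end():]
```

Because the payload is handed back as raw bytes, a later version of the header could also name a character set, and the payload would be decoded only after the header has been understood.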

It would also be interesting to see if a format could be devised of
which a Lisp language is one specific application. That is, for example,
Scheme s-exprs could be considered to be format X with rules Y.
