Re: (OT) Q on Unicode and XML



There are techniques for including text in two different writing systems with
different encodings in the same document. In fact, what is perhaps the most
common of them was created to allow Chinese text to be embedded in documents
primarily in English. This is the "HZ escape" method, described in
RFC 1843 (http://www.ietf.org/rfc/rfc1843.txt). I don't see why it would be
impossible to do an XML version of such an approach.
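
For what it's worth, here is a minimal sketch of what HZ looks like in practice, using the "hz" codec in Python's standard library (Python is only my choice for illustration, and the variable names are made up). Per RFC 1843, "~{" switches into two-byte GB2312 mode and "~}" switches back to ASCII, so the whole stream stays 7-bit:

```python
# Sketch of RFC 1843 HZ encoding using Python's standard "hz" codec.
# Variable names here are illustrative only.

# English text followed by two Chinese characters (U+4E2D, U+6587).
mixed = "Hello \u4e2d\u6587"

# Encode: the Chinese run is introduced by the "~{" escape.
encoded = mixed.encode("hz")

# Every byte is 7-bit ASCII, which is the point of HZ.
is_seven_bit = all(b < 0x80 for b in encoded)

# Decode back to the original string.
decoded = encoded.decode("hz")

print(encoded)
print(is_seven_bit, decoded == mixed)
```

An XML analogue would presumably need some similar in-band switch, which is part of why doing it all in Unicode is simpler.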

But as Benny Riefenstahl says, why bother? It's easier just to do
it all in Unicode. And as Jeff Hobbs says, why use XML when msgcat will work?

--
Bill Poser, Linguistics, University of Pennsylvania
http://www.ling.upenn.edu/~wjposer/ billposer@xxxxxxxxxxxx