Re: Mysterious xml.sax Encoding Exception
- From: John Machin <sjmachin@xxxxxxxxxxx>
- Date: Mon, 4 Feb 2008 15:09:48 -0800 (PST)
On Feb 5, 9:02 am, JKPeck <JKP...@xxxxxxxxx> wrote:
On Feb 2, 12:56 am, Jeroen Ruigrok van der Werven <asmo...@in-nomine.org> wrote:
-On [20080201 19:06], JKPeck (JKP...@xxxxxxxxx) wrote:
In both of these cases, there are only plain, 7-bit ascii characters
in the xml, and it really is valid utf-16 as far as I can tell.
Did you mean to say that the only characters they used in the UTF-16 encoded
file are characters from the Basic Latin Unicode block?
It appears that the root cause of this problem is indeed passing a
Unicode XML string to xml.sax.parseString with an encoding declaration
in the XML of utf-16. This works with the standard distribution on
Windows.
It did NOT work for me with the standard 2.5.1 Windows distribution --
see the code + output that I posted.
It does not work with ActiveState on Windows, even though both
distributions report 64K for sys.maxunicode.
So I don't know why the results are different, but the problem is
solved by encoding the Unicode string into utf-16 before passing it to
the parser.
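For anyone landing on this thread later, the workaround described above can be sketched roughly as follows. This is only an illustration, not the code from the original post: the sample document and the TextCollector handler are made up for the example, and it assumes a current Python where xml.sax.parseString accepts a byte string.

```python
import xml.sax


class TextCollector(xml.sax.ContentHandler):
    """Hypothetical handler: records element names and character data."""

    def __init__(self):
        xml.sax.ContentHandler.__init__(self)
        self.names = []
        self.chunks = []

    def startElement(self, name, attrs):
        self.names.append(name)

    def characters(self, content):
        # The parser may deliver text in several pieces, so collect them.
        self.chunks.append(content)


# A Unicode string whose XML declaration claims utf-16 (made-up sample).
xml_text = u'<?xml version="1.0" encoding="utf-16"?><root><item>hello</item></root>'

# Encode first so the bytes actually match the declaration.
# The "utf-16" codec prepends a byte order mark.
xml_bytes = xml_text.encode("utf-16")

handler = TextCollector()
xml.sax.parseString(xml_bytes, handler)

print(handler.names)
print("".join(handler.chunks))
```

The point is simply that the parser is handed bytes whose actual encoding agrees with the declaration, rather than a Unicode string that merely claims to be utf-16.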