cElementTree encoding woes
- From: "Diez B. Roggisch" <deets@xxxxxxxxxxxxx>
- Date: Mon, 20 Feb 2006 11:37:56 +0100
Hi,
I've got to deal with a pretty huge XML-document, and to do so I use the
cElementTree.iterparse functionality. Working great.
Only trouble: The guys creating that chunk of XML - well, let's just say they
are "encodingly challenged", so they don't produce utf-8, but only cp1252
instead, declared under a weird name (Windows-1252). That name is not
part of the standard codecs module. cp1252 is, of course.
But that won't work for iterparse. So currently, I manually change the
encoding given to utf-8, and use a stream-recoder.
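
Roughly, the workaround looks like the sketch below. It is only an
illustration under some assumptions: it uses modern Python 3 with
xml.etree.ElementTree standing in for cElementTree, the class name
DeclaredUtf8Reader and the regular expression are made up for this example,
and it assumes the XML declaration fits completely into the first chunk that
iterparse reads.

```python
import codecs
import io
import re
import xml.etree.ElementTree as ET

# Matches the encoding pseudo-attribute inside the XML declaration.
# (Hypothetical helper pattern, not part of any library.)
_DECL_ENCODING = re.compile(rb'(<\?xml[^>]*?encoding=["\'])[^"\']+(["\'])')


class DeclaredUtf8Reader:
    """File-like wrapper that recodes a byte stream to UTF-8 on the fly
    and rewrites the declared encoding name to match."""

    def __init__(self, raw, source_encoding="cp1252"):
        # codecs.EncodedFile returns a StreamRecoder: reads decode the
        # underlying file as source_encoding and hand back UTF-8 bytes.
        self._recoder = codecs.EncodedFile(raw, "utf-8", source_encoding)
        self._first_chunk = True

    def read(self, size=-1):
        data = self._recoder.read(size)
        if self._first_chunk:
            # Assumption: the whole declaration is in the first chunk.
            self._first_chunk = False
            data = _DECL_ENCODING.sub(rb"\1utf-8\2", data, count=1)
        return data


def iter_elements(raw, source_encoding="cp1252"):
    """Yield elements from iterparse over the recoded stream."""
    reader = DeclaredUtf8Reader(raw, source_encoding)
    for _event, elem in ET.iterparse(reader):
        yield elem
```

Something like iter_elements(open("huge.xml", "rb")) would then stream the
document (the file name is just a placeholder), with the parser only ever
seeing utf-8.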
However, I was wondering if I could teach cElementTree about that encoding
name. I tried to register cp1252 under the name Windows-1252, but had no
luck - cET won't buy it.
Any suggestions?
Diez