Re: not quite 1252



Anton Vredegoor wrote:

> I'm trying to import text from an OpenOffice document (save as .sxw and
> read the data from content.xml inside the sxw-archive using
> elementtree and such tools).
>
> The encoding that gives me the fewest problems seems to be cp1252,
> however it's not completely perfect because there are still characters
> in it like \93 or \94. Has anyone handled this before?

This might help:

http://effbot.org/zone/unicode-gremlins.htm

</F>
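
For reference, a minimal sketch of the kind of repair the question seems to call for. It assumes the stray \93 and \94 are the cp1252 curly quotes (bytes 0x93 and 0x94) that ended up as the control characters U+0093 and U+0094 because the text was decoded as Latin-1 somewhere along the way. The function name is made up for illustration, and this is not necessarily the code on the linked page:

```python
import re

# C1 control characters (U+0080..U+009F). Assumption: in this text they are
# really cp1252 bytes that were decoded as Latin-1 by mistake.
_C1_RE = re.compile("[\u0080-\u009f]")


def fix_cp1252_gremlins(text):
    """Replace C1 control characters with their cp1252 meaning.

    Hypothetical helper for illustration only.
    """
    def _fix(match):
        ch = match.group(0)
        try:
            # Reinterpret the code point as a single cp1252 byte.
            return bytes([ord(ch)]).decode("cp1252")
        except UnicodeDecodeError:
            # A few bytes (e.g. 0x81) are undefined in cp1252; leave them alone.
            return ch

    return _C1_RE.sub(_fix, text)


if __name__ == "__main__":
    # ascii() is used so the result is visible on any terminal encoding.
    print(ascii(fix_cp1252_gremlins("\x93quoted\x94")))
```

If the gremlins turn out to come from somewhere else (for example from the way the file is read rather than from the document itself), it is probably better to fix the decoding step than to patch the text afterwards.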







