Re: Unicode problems, yet again



Jp Calderone wrote:

Alas, what you have is not an iso-8859-2 string fetched from a database. That is the root of the problem you're having. What you have is a unicode string.

Yes, you're right :) I actually did have iso-8859-2 data, but, as I found out late last night, the data got converted to unicode along the way.


Thanks to all who replied so quickly :)

(Does anyone else feel that python's unicode handling is, well... suboptimal at least?)

Hmm. Not really. The only problem I've found with it is the misguided attempt to "do the right thing" by implicitly encoding unicode strings, and this isn't much of a problem once you figure things out, because you can always do things explicitly and avoid invoking the implicit behavior.

I'm learning that, the hard way :)
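
For anyone reading along, the "do things explicitly" advice above can be sketched roughly as follows. This is only an illustration, not code from the thread: the helper names to_text and to_bytes and the sample bytes are made up, and it is written for Python 3, where byte strings and unicode strings are never mixed implicitly. The idea is to decode once where the data comes in, work with unicode in between, and encode once where the data goes out.

```python
# Hypothetical helpers illustrating explicit conversion at the program's edges.
# The default codec name is an assumption: it is the 8-bit encoding
# mentioned in this thread.
DEFAULT_CODEC = "iso-8859-2"


def to_text(raw, encoding=DEFAULT_CODEC):
    """Decode a byte string into a unicode string, naming the codec explicitly."""
    if isinstance(raw, str):
        # Already unicode (e.g. a database driver converted it along the way).
        return raw
    return raw.decode(encoding)


def to_bytes(text, encoding=DEFAULT_CODEC):
    """Encode a unicode string into bytes, naming the codec explicitly."""
    return text.encode(encoding)


# Made-up sample: the Polish word "łódź" as iso-8859-2 bytes.
raw = b"\xb3\xf3d\xbc"

text = to_text(raw)                 # decode once, on the way in
same = to_text(text)                # unicode input passes through untouched
latin2 = to_bytes(text)             # encode once, on the way out
utf8 = to_bytes(text, "utf-8")      # or pick a different output encoding
```

Because every conversion names its codec, nothing depends on the interpreter's default encoding, which is the implicit behavior being avoided.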

One thing I've always wanted to do (though it probably can't be done?) is set the default/implicit encoding to the one I'm using... I often have to deal with 8-bit encodings and rarely with unicode. Can it be done per-program?



