Re: How to display unicode with the CGI module?



paul wrote:
> However, this will change in py3k..., what's the new rule of thumb?

In py3k, the str type will be what unicode is now, and there
will be a new type called bytes for holding binary data --
including text in some external encoding. These two types
will not be compatible.
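
As a rough illustration of that split, here is a minimal sketch assuming Python 3 as it was eventually released. The sample string is arbitrary and only there to show that the two types count different things.

```python
# Assumes Python 3 as released; the sample text is arbitrary.
text = "naïve café"          # str: a sequence of Unicode code points
data = text.encode("utf-8")  # bytes: the same text in an external encoding

print(type(text).__name__, len(text))  # length in code points
print(type(data).__name__, len(data))  # length in bytes
print(text == data)                    # a str never compares equal to bytes
```

The lengths differ because each accented character takes two bytes in UTF-8, and the comparison is false even though both objects hold "the same" text.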

At the lowest level, reading a file will return bytes, which
then have to be decoded to produce a (unicode) str, and a str
will have to be encoded into bytes before being written to a
file.
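
The round trip described above might look something like this in released Python 3. This is only a sketch: the scratch file name is made up for the example, and UTF-8 is an assumed encoding, not something the file itself tells you.

```python
import os
import tempfile

# Hypothetical scratch file, created only for this example.
path = os.path.join(tempfile.mkdtemp(), "example.txt")

# Writing at the byte level: the str must be encoded first.
with open(path, "wb") as f:
    f.write("grüße\n".encode("utf-8"))

# Reading in binary mode hands back bytes, not text.
with open(path, "rb") as f:
    raw = f.read()

# Decoding is an explicit step, and you have to know the encoding.
text = raw.decode("utf-8")
print(type(raw).__name__, type(text).__name__)
```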

There will be wrappers for text files that perform the
decoding and encoding automatically, but they will need to
be set up to use a specified encoding if you're dealing
with anything other than ASCII. (It may be possible to
set up a system-wide default; I'm not sure.)
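
For what it's worth, in Python 3 as released the wrapper is io.TextIOWrapper, and open() in text mode returns one for you. A small sketch, using in-memory byte buffers in place of real files and assuming UTF-8 as the encoding:

```python
import io

# An in-memory byte stream stands in for a real file here.
buffer = io.BytesIO()

# The wrapper encodes each str on the way out, using the encoding it was given.
writer = io.TextIOWrapper(buffer, encoding="utf-8", newline="\n")
writer.write("żółw\n")
writer.flush()
encoded = buffer.getvalue()

# A second wrapper decodes the same bytes on the way back in.
reader = io.TextIOWrapper(io.BytesIO(encoded), encoding="utf-8")
decoded = reader.read()

print(encoded)
print(decoded)
```

With real files the usual spelling is open(path, "r", encoding="utf-8"). If the encoding argument is left out, the default has traditionally been taken from the locale, so passing it explicitly is the safer habit.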

So you won't be able to get away with ignoring encoding
issues in py3k. On the plus side, it should all be handled
in a much more consistent and less error-prone way. If
you mistakenly try to use encoded data as though it were
decoded data or vice versa, you'll get a type error.
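
A small sketch of that failure mode, again assuming Python 3 as released (the variable names are just for illustration):

```python
# Encoded data, e.g. as it would arrive from a socket or a binary file.
encoded = "héllo".encode("utf-8")

error = None
try:
    # Mixing decoded text (str) with encoded data (bytes) is rejected outright.
    result = "prefix: " + encoded
except TypeError as exc:
    error = exc
    print("TypeError:", exc)

# The fix is to decode explicitly before treating the data as text.
fixed = "prefix: " + encoded.decode("utf-8")
print(fixed)
```

The mistake shows up at the point where the two types meet, rather than later as a UnicodeError that depends on what characters happen to be in the data.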

--
Greg