Re: strange behaviour of str()



Juho Vuori wrote:

> str(u'lää') raises UnicodeEncodeError

> Is this behaviour sane? Possibly, but not documented at all.

str() on a Unicode string attempts to convert the string to an 8-bit
string using Python's default encoding, which is ASCII. "ä" is not
an ASCII character.
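
A minimal sketch of the failure, assuming the implicit conversion behaves like an explicit encode to ASCII. The helper name to_ascii_bytes is made up for illustration, and the snippet uses an explicit .encode() call so it should run on both Python 2 and Python 3:

```python
# The word from the question, written with escapes so the
# source file's own encoding does not matter.
text = u'l\xe4\xe4'

def to_ascii_bytes(value):
    """Hypothetical helper: do explicitly what the implicit
    ASCII conversion is assumed to do behind the scenes."""
    return value.encode('ascii')

try:
    to_ascii_bytes(text)
    failed = False
except UnicodeEncodeError as exc:
    # U+00E4 has no ASCII representation, so the encode step raises.
    failed = True
    print("conversion failed: %s" % exc)
```

A pure-ASCII Unicode string such as u'abc' passes through the same helper without error, which is why the problem often goes unnoticed until non-English text shows up.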

If this problem appears in a third-party program, that program has
not been properly internationalized.

> Somehow you'd expect str() to never fail.

Except for id() and type(), virtually all builtins can fail. If you want
to convert something to a string no matter what it contains, repr() is
a better choice. If you want to convert Unicode strings to a given
byte encoding, you have to use the encode method.
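
A short sketch of both suggestions. The encodings chosen here (UTF-8, ISO-8859-1) and the 'replace' error handler are only examples, not something the original question specified; pick whatever the receiving end expects:

```python
text = u'l\xe4\xe4'  # the word from the question, as escapes

# Explicit encoding: you decide which byte encoding is used.
utf8_bytes = text.encode('utf-8')          # two bytes per a-umlaut
latin1_bytes = text.encode('iso-8859-1')   # one byte per a-umlaut

# If the target really must be ASCII, an error handler avoids the
# exception at the cost of losing the non-ASCII characters.
fallback = text.encode('ascii', 'replace')

# repr() gives a printable form of the string without going through
# the default encoding; its exact output differs between Python versions.
shown = repr(text)
```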

</F>


