encode() question
s1 = "hello"
s2 = s1.encode("utf-8")
s1 = "an accented 'e': \xc3\xa9"
s2 = s1.encode("utf-8")
The last line produces the error:
---
Traceback (most recent call last):
  File "test1.py", line 6, in ?
    s2 = s1.encode("utf-8")
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 17: ordinal not in range(128)
---
The error is a "decode" error, and as far as I can tell, decoding
happens when you convert a regular string to a unicode string. So, is
there an implicit conversion taking place from s1 to a unicode string
before encode() is called? By what mechanism?
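
The traceback is consistent with that reading: in Python 2, calling encode() on a regular (byte) string first decodes it to unicode with the default codec (normally 'ascii'), then encodes the result. Below is a minimal sketch of that two-step behaviour, written in Python 3 syntax so that the hidden step is spelled out. The helper name py2_style_encode and its parameters are made up for illustration and are not part of any library.

```python
def py2_style_encode(byte_string, target="utf-8", default="ascii"):
    """Emulate what Python 2 does for str.encode() (hypothetical helper)."""
    # Step 1 (implicit in Python 2): bytes -> unicode using the default codec.
    # This is where a UnicodeDecodeError can come from.
    text = byte_string.decode(default)
    # Step 2 (the call that was actually written): unicode -> bytes.
    return text.encode(target)


# Pure ASCII input survives the implicit decode, so the call succeeds.
print(py2_style_encode(b"hello"))  # b'hello'

# The bytes 0xc3 0xa9 are not ASCII, so step 1 fails before any encoding happens.
try:
    py2_style_encode(b"an accented 'e': \xc3\xa9")
except UnicodeDecodeError as exc:
    print(exc.encoding, exc.start)  # ascii 17
```

If the byte string already holds UTF-8 data, as s1 appears to here, there is nothing left to encode; decoding it explicitly with s1.decode("utf-8") is what yields a unicode string.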