Re: utf8 silly question
- From: "Grig Gheorghiu" <grig.gheorghiu@xxxxxxxxx>
- Date: 21 Jun 2005 10:00:08 -0700
Hi, Catalin
You can first convert your c string to unicode, and in the process
specify an encoding that covers its non-ASCII characters. If you don't
specify an encoding, Python falls back to the default one, which is most
likely ASCII, and you'll get the error you mentioned. In the
following example, I specified 'iso-8859-1' as the encoding.
Then you can encode the resulting unicode object as UTF-8 via the
codecs module.
Here's a snippet of code (note the error when I don't specify an
encoding explicitly):
Python 2.4 (#1, Nov 30 2004, 16:42:53)
[GCC 3.2.2 20030222 (Red Hat Linux 3.2.2-5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> c = unicode(chr(169)+" some text")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeDecodeError: 'ascii' codec can't decode byte 0xa9 in position 0: ordinal not in range(128)
>>> c = unicode(chr(169)+" some text", 'iso-8859-1')
>>> print c
© some text
>>> import codecs
>>> print codecs.encode(c, 'utf-8')
© some text
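For comparison, here is a minimal sketch of the same two steps written with the decode() and encode() string methods rather than unicode() and codecs.encode(). The names raw, text, utf8_bytes and ascii_failed are just illustrative, and the b"..." literal assumes Python 2.6 or later (it also runs on Python 3):

```python
# Illustrative input: the Latin-1 byte 0xA9 (copyright sign) followed by ASCII text.
raw = b"\xa9 some text"

# Step 1: decode the bytes to a unicode object, naming the source encoding.
text = raw.decode("iso-8859-1")

# Step 2: encode the unicode object as UTF-8 bytes.
utf8_bytes = text.encode("utf-8")

# Decoding the same bytes as ASCII fails, because 0xA9 is outside range(128).
try:
    raw.decode("ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True
```

In UTF-8 the copyright sign takes two bytes (0xC2 0xA9), so utf8_bytes is one byte longer than raw.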