Re: 'ascii' codec can't encode character u'\xf3'

From: oziko (oziko_at_fusiondementes.com)
Date: 08/17/04


Date: Tue, 17 Aug 2004 10:34:18 -0500
To: python-list@python.org

I solved the printing problem using

print str.encode('iso-8859-1')

Now I can print the tags with no apparent problem. But now when I try to
insert that value into a PostgreSQL database I get the same error. I
created the PostgreSQL database with Unicode as its default encoding, using

createdb -E UNICODE oggtest

The data I am putting into the database is in the u'Perfeccion' format, so
I understand it is Unicode, but I get the same error:

Traceback (most recent call last):
   File "./ogg2sql.py", line 82, in ?
     db_cursor.execute(do)
   File "/usr/lib/python2.3/site-packages/pyPgSQL/PgSQL.py", line 3035,
in execute
     _qstr = self.__unicodeConvert(_qstr)
   File "/usr/lib/python2.3/site-packages/pyPgSQL/PgSQL.py", line 2740,
in __unicodeConvert
     return obj.encode(*self.conn.client_encoding)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf3' in
position 102: ordinal not in range(128)
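
The last frame of the traceback shows the driver calling
obj.encode(*self.conn.client_encoding), and the error names the 'ascii'
codec, so the connection's client encoding appears to be ASCII. A minimal
sketch of that step in isolation, assuming a made-up query string and a
hypothetical helper named convert (neither comes from pyPgSQL). It is
written so that it also runs on current Python versions:

```python
# -*- coding: utf-8 -*-
# Hypothetical query text; u'\xf3' is the character named in the traceback.
query = u"insert into tracks(titulo) values('Perfecci\xf3n')"


def convert(text, client_encoding):
    # Mirrors the traceback line: obj.encode(*self.conn.client_encoding).
    # client_encoding is a tuple of arguments for encode(), e.g. ('utf-8',).
    return text.encode(*client_encoding)


# With an ASCII client encoding the conversion fails, as in the traceback.
try:
    convert(query, ('ascii',))
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# With an encoding that can represent the character, it succeeds.
utf8_query = convert(query, ('utf-8',))
latin1_query = convert(query, ('iso-8859-1',))
```

If this is what is happening, the connection needs a client encoding that
matches the database (UTF-8 for one created with -E UNICODE). How to set it
is driver-specific, so check the pyPgSQL documentation.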

my insert query is:

tracks_insert_values =(unicode(coments['TITLE']),coments['TRACKNUMBER'])

I also tried with:

tracks_insert_values=(coments['TITLE'].encode('utf-8'),coments['TRACKNUMBER'])

insert_query = '''insert into tracks(titulo,no_pista)values(%s %i)''' % tracks_insert_values
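
Building the SQL text with % also leaves the title unquoted and mixes the
data into the query string. Most DB-API drivers can bind the values
themselves. A minimal sketch of that idea, using the standard sqlite3 module
purely as a stand-in for pyPgSQL so that it is self-contained; the tag values
are hypothetical, and the placeholder style differs between drivers (sqlite3
uses ?, others use %s), so check the driver's paramstyle:

```python
# -*- coding: utf-8 -*-
import sqlite3  # stand-in driver, used only to keep the sketch runnable

# Hypothetical tag values in the same shape as the post's "coments" dict.
coments = {'TITLE': u'Perfecci\xf3n', 'TRACKNUMBER': u'7'}

conn = sqlite3.connect(':memory:')
cur = conn.cursor()
cur.execute('create table tracks (titulo text, no_pista integer)')

# The values travel separately from the SQL text, so the query string
# itself stays plain ASCII and needs no manual quoting or encoding.
values = (coments['TITLE'], int(coments['TRACKNUMBER']))
cur.execute('insert into tracks (titulo, no_pista) values (?, ?)', values)
conn.commit()

# Read the row back to confirm the title survived the round trip.
row = cur.execute('select titulo, no_pista from tracks').fetchone()
conn.close()
```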

Martin Slouf wrote:
> i had similar errors:
>
> Traceback (most recent call last):
> File "/home/martin/skripty/accounts.py", line 125, in ?
> main(sys.argv)
> File "/home/martin/skripty/accounts.py", line 119, in main
> print_accounts(accounts, url_part)
> File "/home/martin/skripty/accounts.py", line 94, in print_accounts
> print str(i).encode("utf-8", "replace")
> UnicodeEncodeError: 'ascii' codec can't encode characters in position
> 151-152: ordinal not in range(128)
>
> - - - -
>
> the solution seems to be:
>
> 0. string is not in unicode encoding (assumption)
> 1. before printing out, convert the string to unicode
> 2. when printing, convert to whatever charset you like
>
> though i don't understand much why (i've solved it a minute ago :) the
> code should be:
>
> str = "any nonunicode string"
> print unicode(str).encode("iso-8859-2", "replace")
>
> comments:
>
> 1. why the string is not in unicode can have several reasons -- i guess:
> - does ogg store tags in unicode?
> - you have parsed an xml file with encoding attribute set (that
> is what i do)
> - etc
>
> 2. the "replace" parameter in encode causes characters that cannot be
> encoded to be replaced with '?' (you can also use "ignore" or "strict",
> see your python doc)
>
> 3. the above will work _only_ _if_ the 'str' encoding is "iso-8859-2" --
> a funny thing -- first line of code converts from unknown (but the
> programmer must know it) to unicode and the second one converts it back
> from unicode to unknown (now the programmer tells that secret to python
> :)
>
> 4. i would like to know from any python expert whether/why/why not:
>
> * my assumptions are right
>
> * why is that behaviour? -- if you search google you get
> thousands of errors like this -- with no proper solutions i must add
>
> * is there an easier portable way (no sitecustomize.py changes)
> to do it
>
> * i was looking in site.py and the sys.setdefaultencoding()
> function is deleted there, but from the comments i do not know
> why -- do you know? why is the user not allowed to change the
> default encoding? it seems reasonable to me that he/she could do that.
>
> thx.
>
> m.
>
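
The two steps in the quoted message (convert to Unicode, then encode for
output) can be sketched as below. One caveat: in Python 2, unicode(str) with
no encoding argument decodes with the default ASCII codec, so for non-ASCII
bytes the source encoding has to be named explicitly. The input bytes here
are a hypothetical ISO-8859-1 example, and the sketch is written so that it
also runs on current Python versions:

```python
# -*- coding: utf-8 -*-
# Hypothetical input: a title as ISO-8859-1 bytes (0xf3 is o with acute).
raw = b'Perfecci\xf3n'

# Step 1: bytes -> Unicode, naming the source encoding explicitly.
text = raw.decode('iso-8859-1')

# Step 2: Unicode -> bytes in whatever charset the output needs.
as_utf8 = text.encode('utf-8')

# Error handlers for characters the target charset cannot represent:
as_ascii_replaced = text.encode('ascii', 'replace')  # becomes '?'
as_ascii_ignored = text.encode('ascii', 'ignore')    # dropped silently
```

The default handler, "strict", raises UnicodeEncodeError instead, which is
the error reported in this thread.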


