Re: A Unicode problem -HELP



manstey wrote:
a=str(word_info + parse + gloss).encode('utf-8')
a=a[1:len(a)-1]

Is this clearer?

Indeed. The problem is your use of str() to "render" the output.
As word_info+parse+gloss is a list (or is it a tuple?), str() applies
repr() to each element and so already produces "Python source code",
i.e. an ASCII byte string that can be read back into the interpreter;
every non-ASCII character has been turned into an escape sequence in
that string. If you want comma-separated output, you should do this:

def comma_separated_utf8(items):
    result = []
    for item in items:
        result.append(item.encode('utf-8'))
    return ", ".join(result)

and then
a = comma_separated_utf8(word_info + parse + gloss)

Then you don't have to drop the parentheses from a anymore, as
it won't have parentheses in the first place.
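
To see why str() was the culprit, here is a small sketch. The sample
values in items are made up for illustration (the real contents of
word_info, parse and gloss are not shown in this thread), and the code
is written so that it should run on both Python 2 and Python 3.

```python
# -*- coding: utf-8 -*-
from __future__ import unicode_literals

# Hypothetical sample data; the real word_info + parse + gloss is not shown.
items = ["caf\u00e9", "na\u00efve", "gloss"]

# str() on a list renders every element with repr(), so the result carries
# brackets and quotes (and, on Python 2, u'' prefixes and \x escapes).
rendered = str(items)

# Joining the elements directly keeps the text itself and nothing else.
joined = ", ".join(items)

# Encoding the joined text gives the UTF-8 bytes for the output.
encoded = joined.encode("utf-8")
```

Comparing rendered with joined shows the extra brackets and quotes
that the a[1:len(a)-1] slice was only partly removing.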

As the output file will already do the encoding itself,
the following should also work:

a = u", ".join(word_info + parse + gloss)

This would make "a" a comma-separated unicode string, so that
the subsequent output_file.write(a) encodes it as UTF-8.
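
As a sketch of that second approach: this assumes output_file was
opened through an encoding-aware wrapper such as io.open(...,
encoding="utf-8") or codecs.open, which the thread does not actually
show. The file name and the sample values are invented for the example.

```python
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import io
import os
import tempfile

# Hypothetical stand-ins for the poster's word_info, parse and gloss lists.
word_info = ["w\u00f6rd"]
parse = ["N"]
gloss = ["gl\u00f6ss"]

# Build one comma-separated unicode string; no encoding happens here.
a = ", ".join(word_info + parse + gloss)

# Assumption: the output file does the encoding itself.
path = os.path.join(tempfile.mkdtemp(), "out.txt")
with io.open(path, "w", encoding="utf-8") as output_file:
    output_file.write(a)

# Read the raw bytes back to check what actually landed on disk.
with io.open(path, "rb") as f:
    raw = f.read()
```

If the file was instead opened with a plain open() on Python 2,
write(a) would try the default ASCII codec and fail on the first
non-ASCII character.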

If that doesn't work, I would like to know what the exact
value of gloss is. Do

print "GLOSS IS", repr(gloss)

to print it out.

Regards,
Martin