Re: problems with Â character

From: John Roth (newsgroups_at_jhrothjr.com)
Date: 03/23/05


Date: Tue, 22 Mar 2005 20:09:55 -0600

I had this problem recently. It turned out that something
had encoded a unicode string into utf-8. When I found
the culprit and fixed the underlying design issue, it went away.
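
A minimal sketch of how this kind of encoding mix-up can produce a stray Â. It assumes Python 3 (the thread itself is Python 2) and assumes the UTF-8 bytes were misread as Latin-1. The helper names `mojibake` and `repair` are hypothetical, not from any library.

```python
# Python 3 sketch; function names are illustrative, not from the thread.

def mojibake(text):
    """Encode text as UTF-8, then wrongly decode those bytes as Latin-1."""
    return text.encode("utf-8").decode("latin-1")

def repair(text):
    """Undo the mistake: recover the raw bytes, then decode them as UTF-8."""
    return text.encode("latin-1").decode("utf-8")

original = "\u00bb"            # RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
garbled = mojibake(original)   # two characters: U+00C2 (Â) followed by U+00BB

print(repr(garbled))
print(repair(garbled) == original)
```

The repair step only works when every character in the garbled text fits in Latin-1; fixing the step that does the extra encoding, as described above, is the more robust route.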

John Roth

"jdonnell" <jaydonnell@gmail.com> wrote in message
news:1111521139.657563.55410@o13g2000cwo.googlegroups.com...
I have a mysql database with characters like Â » in it. I'm
trying to write a python script to remove these, but I'm having a
really hard time.

These strings are coming out as type 'str', not 'unicode', so I tried
just

record[4].replace('Â', '')

but this does nothing. However, the following code works:

#!/usr/bin/python

s = 'aaaaa Â aaa'
print type(s)
print s
print s.find('Â')

This returns
<type 'str'>
aaaaa Â aaa
6

The other odd thing is that the Â character shows up as two spaces if
I print it to the terminal from mysql, but it shows up as Â when I
print from the simple script above.
What am I doing wrong?
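
One possible factor, sketched under assumptions: `replace` returns a new string and leaves the original untouched, so a bare `record[4].replace(...)` discards its result. The sketch below is Python 3, with `bytes` standing in for the Python 2 'str' type, and the `record` contents are invented for illustration. It also assumes the bytes in the database match the bytes of the literal being searched for, which may not hold if the two use different encodings.

```python
# Python 3 sketch: bytes stand in for the Python 2 'str' type in the thread.
# The record contents below are invented for illustration.
target = "\u00c2".encode("utf-8")          # the bytes of a UTF-8 encoded Â
record = [1, 2, 3, 4, "aaaaa \u00c2 aaa".encode("utf-8")]

# replace() returns a new object; without assignment the result is lost.
record[4].replace(target, b"")
unchanged = record[4]

# Assign the result back to keep the change.
record[4] = record[4].replace(target, b"")
cleaned = record[4]

print(unchanged)
print(cleaned)
```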


