Re: Replace accented chars with unaccented ones

From: Jeff Epler (jepler_at_unpythonic.net)
Date: 03/16/04


Date: Tue, 16 Mar 2004 08:00:36 -0600
To: python-list@python.org

On Tue, Mar 16, 2004 at 08:26:08AM +0100, Nicolas Bouillon wrote:
> Thank you both for your answers. They both work very well.
>
> At first I believed it didn't work, because my mistake was
> forgetting the "u" prefix on the string: u"é". Because my file was
> already utf-8 encoded (# -*- coding: UTF-8 -*-), I thought the "u" was
> not necessary... I was wrong.

A non-unicode string literal in a source file is simply a byte
sequence, even when the file declares an encoding. Take this program,
for instance:

# -*- coding: utf-8 -*-
s = "é"
print len(s), repr(s)

$ python bytestr.py
2 '\xc3\xa9'
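
For comparison, here is a rough sketch of the same character held as bytes and as text. It assumes Python 3, where a plain literal is already a text string (the equivalent of u"é" in the program above); strip_accents is a hypothetical helper named only for this illustration, not something taken from earlier in the thread.

```python
import unicodedata

text = "\u00e9"              # the single character e-acute as a text string
raw = text.encode("utf-8")   # the same character as UTF-8 encoded bytes

print(len(raw), repr(raw))     # two bytes: 0xc3 0xa9
print(len(text), ascii(text))  # one character: U+00E9


def strip_accents(s):
    """Hypothetical helper: decompose characters, then drop combining marks."""
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


# This only works on text strings; the raw bytes must be decoded first.
print(strip_accents(raw.decode("utf-8")))
```

The byte string has length 2 and the text string has length 1, which is why character-level operations such as accent removal need the unicode form.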

Jeff


