Re: Replace accented chars with unaccented ones

From: Michael Hudson (mwh_at_python.net)
Date: 03/16/04


Date: Tue, 16 Mar 2004 11:15:30 GMT

Jeff Epler <jepler@unpythonic.net> writes:

> You have two options. First, convert the string to Unicode and use code
> like the following:
>
> replacements = [(u'\xe9', 'e'), ...]
> def remove_accents(u):
>     for a, b in replacements:
>         u = u.replace(a, b)
>     return u
>

There must be some more high-powered way of doing this... something
like:

import unicodedata

def remove_accent1(c):
    # NFD splits an accented character into base letter + combining marks.
    return unicodedata.normalize('NFD', c)[0]

def remove_accents(s):
    return u''.join(map(remove_accent1, s))

?
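
A possible variant of the same idea, as a rough sketch rather than a
tested recipe: normalize the whole string once and drop every combining
mark, instead of taking the first code point of each character. The
name strip_accents is made up for illustration, and the sketch assumes
the input is already a Unicode string. Letters with no canonical
decomposition (u'\xf8', for instance) pass through unchanged, so a
replacement table may still be needed for those.

```python
import unicodedata

def strip_accents(text):
    # Hypothetical helper name; assumes `text` is a Unicode string.
    # NFD rewrites each accented character as base letter + combining marks.
    decomposed = unicodedata.normalize('NFD', text)
    # unicodedata.combining() returns a non-zero class for combining marks.
    return u''.join(ch for ch in decomposed
                    if not unicodedata.combining(ch))

print(strip_accents(u'na\xefve caf\xe9'))
```

Filtering on the combining class also removes stacked marks, which the
first-character approach handles only when a character carries all of
its accents in a single precomposed code point.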

Cheers,
mwh

-- 
  We've had a lot of problems going from glibc 2.0 to glibc 2.1.
  People claim binary compatibility.  Except for functions they
  don't like.                       -- Peter Van Eynde, comp.lang.lisp

