Re: Replace accented chars with unaccented ones
From: Michael Hudson (mwh_at_python.net)
Date: 03/16/04
- Next message: Tim Golden: "Re: Extracting info from OS/hardware"
- Previous message: Fuzzyman: "Re: Replace accented chars with unaccented ones"
- In reply to: Jeff Epler: "Re: Replace accented chars with unaccented ones"
- Next in thread: Noah: "Re: Replace accented chars with unaccented ones"
Date: Tue, 16 Mar 2004 11:15:30 GMT
Jeff Epler <jepler@unpythonic.net> writes:
> You have two options. First, convert the string to Unicode and use code
> like the following:
>
> replacements = [(u'\xe9', 'e'), ...]
> def remove_accents(u):
>     for a, b in replacements:
>         u = u.replace(a, b)
>     return u
>
There must be some more high-powered way of doing this... something
like:
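import unicodedata

def remove_accent1(c):
    # NFD splits a character into its base followed by any combining marks;
    # [0] keeps only the base.
    return unicodedata.normalize('NFD', c)[0]

def remove_accents(s):
    return u''.join(map(remove_accent1, s))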
?
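
A possible whole-string variant of the same idea, as a minimal sketch assuming Python 3 (where every str is Unicode): normalize the entire string once, then drop whatever unicodedata.combining() reports as a combining mark. This also copes with input that is already decomposed, which the per-character [0] trick does not, since a lone combining mark normalizes to itself. The function name is just carried over from above.

```python
import unicodedata

def remove_accents(s):
    # Decompose every character into base + combining marks (NFD) ...
    decomposed = unicodedata.normalize('NFD', s)
    # ... then keep only the characters that are not combining marks.
    return ''.join(c for c in decomposed if not unicodedata.combining(c))

print(remove_accents('café naïve Ångström'))  # cafe naive Angstrom
```

Letters with no canonical decomposition, such as 'ø' or 'ł', pass through unchanged with either version, so an explicit replacement table is still needed for those.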
Cheers,
mwh
-- 
  We've had a lot of problems going from glibc 2.0 to glibc 2.1.
  People claim binary compatibility. Except for functions they
  don't like.
    -- Peter Van Eynde, comp.lang.lisp