Re: Is there a string function to trim all non-ascii characters out of a string



"silverburgh.meryl@xxxxxxxxx" <silverburgh.meryl@xxxxxxxxx> wrote:

Hi,

Is there a string function to trim all non-ASCII characters out of a
string?
Let's say I have a string in Python (which is UTF-8 encoded); is there a
Python function I can use to convert it to a string composed of
only ASCII characters?

Thank you.

Yes, just decode it to unicode (which you should do first for any
encoded string) and then encode it back to ASCII with the error handling
set how you want:

>>> s = '\xc2\xa342'
>>> s.decode('utf8').encode('ascii', 'replace')
'?42'
>>> s.decode('utf8').encode('ascii', 'ignore')
'42'
>>> s.decode('utf8').encode('ascii', 'xmlcharrefreplace')
'&#163;42'
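
The session above is Python 2, where a plain string literal is a byte string and has a decode method. As a rough sketch of the same decode-then-encode round trip under Python 3 (an assumption on my part, since the original post predates it), the input has to be a bytes object, and the result comes back as bytes too. The helper name to_ascii is hypothetical, not something from the standard library:

```python
def to_ascii(data, errors="ignore"):
    """Decode UTF-8 bytes to text, then re-encode as ASCII.

    `errors` is passed to the encoder and controls what happens to
    characters that have no ASCII equivalent.
    """
    text = data.decode("utf8")           # bytes -> str
    return text.encode("ascii", errors)  # str -> ASCII-only bytes


raw = b"\xc2\xa342"  # UTF-8 bytes for a pound sign followed by "42"

print(to_ascii(raw, "replace"))            # b'?42'
print(to_ascii(raw, "ignore"))             # b'42'
print(to_ascii(raw, "xmlcharrefreplace"))  # b'&#163;42'
```

If you already hold a str rather than bytes in Python 3, the decode step is unnecessary and you can call encode('ascii', ...) on it directly.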