Re: Quieter glyphs than parentheses

From: Michael Hudson (mwh_at_python.net)
Date: 02/09/04


Date: Mon, 9 Feb 2004 12:36:34 GMT


"Steven M. Haflich" <smh_no_spiced_ham@alum.mit.edu> writes:

> Some additional work comes in supporting all the external formats needed
> by non ISO8859 language scripts. While Unicode represents pretty much
> everything in a nice, flat, 16-bit code (ignoring that Unicode has
> actually recently overflowed 16 bits -- sigh!) and UTF-8, which is a
> simple space-saving encoding of Unicode, most of the difficult languages
> have one or more different variable-length encodings. For example,
> Japanese has three popular non-Unicode-based encodings, and Lisp
> applications may ultimately need to deal with each. (For example, my
> Japanese wife regularly receives email in _four_ different encodings.)

Surely the way to deal with this is to convert text in such encodings
into your internal representation at the earliest opportunity and back
out of it at the last. Code to convert between your internal string
representation and the various encodings isn't the most entertaining
in the world to write, but it's not hard.
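
A rough sketch of that decode-early, encode-late idea, in Python rather
than Lisp. The helper names read_text and write_text are made up for
illustration, and I'm assuming the encoding of each message is already
known (say, from a mail header). The codec names are the ones Python's
standard library uses for the three common Japanese encodings.

```python
# Sketch only: read_text/write_text are hypothetical helper names.
# Assumes the caller already knows which encoding the bytes are in.

def read_text(raw: bytes, encoding: str) -> str:
    """Decode at the input boundary into the internal (Unicode) string."""
    return raw.decode(encoding)

def write_text(text: str, encoding: str) -> bytes:
    """Encode at the output boundary, as late as possible."""
    return text.encode(encoding)

# The same text arriving in three different non-Unicode encodings.
sample = "日本語"
incoming = {
    name: sample.encode(name)
    for name in ("iso2022_jp", "shift_jis", "euc_jp")
}

# Everything past this point sees only the internal representation.
decoded = {name: read_text(raw, name) for name, raw in incoming.items()}

# Re-encode only when the text leaves the program.
outgoing = write_text(decoded["shift_jis"], "utf-8")
```

The middle of the program never needs to know which encoding a given
message arrived in; only the two boundary functions do.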

Cheers,
mwh

-- 
  I'm about to search Google for contract assassins to go to Iomega
  and HP's programming groups and kill everyone there with some kind
  of electrically charged rusty barbed thing.
                -- http://bofhcam.org/journal/journal.html, 2002-01-08