Re: Supporting full Unicode
From: Ludovic Brenta (ludovic.brenta_at_insalien.org)
Date: 05/12/04
- Next message: Ludovic Brenta: "Re: "Must instantiate controlled types at library level." Why?"
- Previous message: Martin Krischik: "Re: Supporting full Unicode"
- In reply to: Björn Persson: "Re: Supporting full Unicode"
- Next in thread: Björn Persson: "Re: Supporting full Unicode"
- Reply: Björn Persson: "Re: Supporting full Unicode"
Date: 12 May 2004 10:57:25 GMT
Bjorn Persson wrote:
> David Starner wrote:
>> they should have defined Wide_Character to be UTF-16 like Java did.
>
> Keeping in mind that in UTF-16 some characters take two bytes and
> others take four, how do you propose to define that type?
It is true that variable-width encodings such as UTF-16 or UTF-8 are
more difficult to handle than fixed-width encodings like UCS-2 or
UCS-4. Basically, if you want to do advanced processing of character
data, you may find it easier to first transcode it to UCS-4
(i.e. Wide_Wide_Character, 32 bits wide).
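To make the width difference concrete, here is a minimal sketch in Python (chosen only for brevity; the thread itself is about Ada). The helper names utf16_units and to_ucs4 are made up for this illustration. It shows that a character outside the Basic Multilingual Plane needs two 16-bit units in UTF-16, while after transcoding to UCS-4 every character occupies exactly one 32-bit value.

```python
def utf16_units(ch: str) -> int:
    """Number of 16-bit code units needed to encode one character in UTF-16."""
    return len(ch.encode("utf-16-be")) // 2


def to_ucs4(text: str) -> list:
    """Transcode text to a fixed-width list of 32-bit code point values."""
    data = text.encode("utf-32-be")
    return [int.from_bytes(data[i:i + 4], "big") for i in range(0, len(data), 4)]


# 'A', e-acute, and U+1D11E (musical G clef, outside the BMP)
sample = "A\u00e9\U0001D11E"

widths = [utf16_units(c) for c in sample]  # variable: one or two units each
code_points = to_ucs4(sample)              # fixed: one value per character
```

With the fixed-width form, indexing the n-th character is a plain array lookup; with UTF-16 you would have to scan for surrogate pairs first.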
But UTF-8 is gaining momentum. Originally intended as an external
encoding only, it is now in use as an internal encoding, too. I
suppose that it turned out that processing UTF-8 directly is not that
difficult after all. This is especially true if all you want to do is
localisation of software using gettext; in this case, you can use
UTF-8 as both your internal and external encoding without any trouble.
The Perl regular expression engine, for example, supports UTF-8
strings directly. I don't know if it transcodes to UCS-4 internally.
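As a rough illustration of why working on UTF-8 directly is often manageable, the following Python sketch counts characters and searches for an ASCII substring without decoding anything. It says nothing about how Perl does it internally; the function name utf8_length and the sample string are assumptions made up for this example. It relies only on two properties of UTF-8: continuation bytes always have the bit pattern 10xxxxxx, and ASCII bytes never appear inside a multi-byte sequence.

```python
def utf8_length(data: bytes) -> int:
    """Count characters in valid UTF-8 by skipping continuation bytes (10xxxxxx)."""
    return sum(1 for b in data if (b & 0xC0) != 0x80)


# Hypothetical localised message: "Gr\u00f6\u00dfe: 10 \u20ac" (o-umlaut, sharp s, euro sign)
text = "Gr\u00f6\u00dfe: 10 \u20ac"
raw = text.encode("utf-8")

char_count = utf8_length(raw)   # number of characters, not bytes
byte_count = len(raw)           # larger, because three characters are multi-byte

# A plain byte search for ASCII text is safe: ASCII bytes cannot occur
# in the middle of a multi-byte UTF-8 sequence.
offset = raw.find(b"10")
```

For the gettext case, the program mostly passes such strings through unchanged, so even this much processing is rarely needed.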
-- Ludovic Brenta.