Re: Unicode question
- From: Hans-Peter Diettrich <DrDiettrich1@xxxxxxx>
- Date: Tue, 05 Feb 2008 02:11:35 +0100
Troy Wolbrink wrote:
type as parameter/variable at some point... or will just cut/trim
strings in the middle of a codepoint.
OK, I just thought of a scenario in a LOB app where this might actually bite you. It's when you have a database field with a maximum of 50 (for example) characters, and you auto-cut the string to the first 50 16-bit characters.
If you have a database with UTF-16 string encoding... ;-)
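To make the auto-cut scenario concrete, here is a minimal sketch in Python (not the Delphi code anyone in this thread is actually using). The function name `truncate_utf16` is made up for illustration, and the input is assumed to be well-formed text without lone surrogates. It cuts to a maximum number of 16-bit units and backs off by one unit if the cut would separate a surrogate pair.

```python
def truncate_utf16(text: str, max_units: int) -> str:
    """Cut text to at most max_units UTF-16 code units, never splitting a surrogate pair.

    Hypothetical helper for illustration; assumes well-formed input.
    """
    if max_units <= 0:
        return ""
    data = text.encode("utf-16-le")          # 2 bytes per 16-bit code unit, no BOM
    if len(data) <= 2 * max_units:
        return text                          # already short enough
    cut = 2 * max_units                      # byte offset of the naive cut
    last_unit = int.from_bytes(data[cut - 2:cut], "little")
    if 0xD800 <= last_unit <= 0xDBFF:        # high surrogate: its partner would be lost
        cut -= 2                             # drop the whole pair instead
    return data[:cut].decode("utf-16-le")


# 49 ASCII letters, then U+1D11E (two code units), then one more letter
field = "a" * 49 + "\U0001D11E" + "b"
print(len(truncate_utf16(field, 50)))        # number of code points kept
```

A naive cut at 50 units would leave a dangling high surrogate at the end of the field; the sketch keeps 49 characters instead.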
When it comes to the size of a data field, the limit is either a number of elements (regardless of their size) or no limit at all. Trimming a UTF-8 string to a maximum number of bytes is just as awkward as trimming a UTF-16 string to a maximum number of 16-bit units.
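The UTF-8 counterpart looks much the same. Again this is only a sketch in Python with a made-up name (`truncate_utf8`), assuming well-formed input: when the byte limit falls inside a multi-byte sequence, it backs up over the continuation bytes (bit pattern `10xxxxxx`) until the cut lands on the start of a sequence.

```python
def truncate_utf8(text: str, max_bytes: int) -> str:
    """Cut text to at most max_bytes UTF-8 bytes, never splitting a code point.

    Hypothetical helper for illustration; assumes well-formed input.
    """
    data = text.encode("utf-8")
    if len(data) <= max_bytes:
        return text                          # already short enough
    cut = max(max_bytes, 0)
    # data[cut] is the first byte that would be dropped. If it is a
    # continuation byte, the cut is in the middle of a sequence: back up.
    while cut > 0 and (data[cut] & 0xC0) == 0x80:
        cut -= 1
    return data[:cut].decode("utf-8")


print(truncate_utf8("h\u00e9llo", 2))        # the 2-byte e-acute does not fit after "h"
```

Either way the result can be shorter than the limit, so a fixed-size field has no fixed character capacity in either encoding.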
But most of the time a string has to be truncated for display purposes, and for that every UTF encoding is awkward. Splitting at well-known separator characters should not be a problem. Tab expansion is much nastier, regardless of the encoding.
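A rough sketch of why tab expansion is nasty regardless of the encoding: the next tab stop depends on the display column, not on the number of bytes, code units, or even code points. The Python below is illustrative only. The helper names are made up, and the width rule (combining marks count 0 columns, East Asian wide and fullwidth characters count 2, everything else 1) is a simplifying assumption that real terminals and GUI fonts do not all share.

```python
import unicodedata


def display_width(ch: str) -> int:
    """Approximate column width of one character (simplifying assumption)."""
    if unicodedata.combining(ch):
        return 0                             # combining mark sits on the previous character
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2                             # wide / fullwidth East Asian character
    return 1


def expand_tabs(line: str, tab: int = 8) -> str:
    """Replace tabs by spaces up to the next tab stop, counting display columns."""
    out = []
    col = 0
    for ch in line:
        if ch == "\t":
            pad = tab - col % tab            # distance to the next tab stop
            out.append(" " * pad)
            col += pad
        else:
            out.append(ch)
            col += display_width(ch)
    return "".join(out)


sample = "\u4e2d\tx"                         # one wide CJK character, a tab, then "x"
print(repr(expand_tabs(sample, 4)))
print(repr(sample.expandtabs(4)))            # built-in version counts characters, not columns
```

The built-in `str.expandtabs` treats the wide character as one column and so pads with one space too many under the width rule assumed here.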
But let me ask this (sincerely)... What languages are not handled within standard 16-bit Unicode? Are we dealing with extremely rare language groups? It seems like you get a lot of mileage with approx 64,000 code points (2^16)!
AFAIR some character sets (not natural languages, apart from Chinese) reside in the high code space, i.e. above U+FFFF.
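For a feel of what lives up there, a small Python sketch (the helper `utf16_units` and the choice of samples are mine, picked purely for illustration) that prints how many 16-bit units a few characters need in UTF-16. Anything above U+FFFF takes a surrogate pair.

```python
def utf16_units(ch: str) -> int:
    """Number of 16-bit code units needed to encode ch in UTF-16."""
    return len(ch.encode("utf-16-le")) // 2


# Illustrative samples only: two BMP characters, three from the supplementary planes
samples = {
    "Latin small letter a": "\u0061",
    "CJK ideograph (BMP)": "\u4e2d",
    "CJK Extension B ideograph": "\U00020000",
    "Musical symbol G clef": "\U0001d11e",
    "Gothic letter ahsa": "\U00010330",
}

for label, ch in samples.items():
    print(f"U+{ord(ch):04X}  {label}: {utf16_units(ch)} code unit(s)")
```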
I agree that truly international applications deserve much more effort than merely representing text in foreign code pages. Who has ever thought about word wrap or hyphenation in Chinese text? Or about proper handling of RTL reading, with embedded LTR for numbers, and including the placement of labels on the right(! ;-)) side of a control? No, folks, writing truly international apps takes a bit more than Unicode. For my part, I am happy with UCS-2 and WideStrings for storing and retrieving text in languages for which I can be halfway sure that my app makes sense. UTF-8 for string handling will be a choice only on (non-Windows) systems where the GUI, and possibly the API, is based on UTF-8 strings.
DoDi