Re: Unicode Delphi Win32 - which approach
- From: Help <nospam@xxxxx>
- Date: Sat, 09 Jun 2007 15:34:08 +0800
"m. Th." <a@xxxxx> wrote:
> What are, in your opinion, the disadvantages of string ( := UTF-16) compared
> with string ( := UTF-8)?
I like the backwards compatibility aspects of UTF-8 vs UTF-16. While the
UTF-8 encoding is different from ANSI, at least it's still byte-oriented
like 'most' streams of data. There are also the space savings. In
general UTF-8 is a clever, tight piece of design and a good way to
encode characters of varying width.
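
A rough sketch of the compatibility and space points, in Python only because it is easy to run (nothing Delphi-specific here; the sample strings and the helper name encoded_sizes are made up for illustration):

```python
def encoded_sizes(text):
    """Return (utf8_bytes, utf16_bytes) for text, both without a BOM."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))

# Hypothetical sample strings, chosen only to show the three typical cases.
ascii_text = "Hello, Delphi"           # pure ASCII
mixed_text = "na\u00efve caf\u00e9"    # Latin text with two accented letters
cjk_text = "\u65e5\u672c\u8a9e"        # three CJK characters

# Pure ASCII text produces exactly the same bytes in ASCII and in UTF-8.
same_bytes = ascii_text.encode("ascii") == ascii_text.encode("utf-8")

if __name__ == "__main__":
    print("ASCII bytes unchanged in UTF-8:", same_bytes)
    for label, text in [("ascii", ascii_text), ("mixed", mixed_text), ("cjk", cjk_text)]:
        print(label, encoded_sizes(text))
```

The space saving holds for mostly-ASCII text; for the CJK sample the comparison goes the other way, so it depends on the data.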
I also appreciate that UTF-8, being a variable-width encoding, will
force programmers to "think" Unicode, and not incorrectly assume that
Unicode = a 2-byte character set.
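
To make the "not a 2-byte character set" point concrete, here is a small Python sketch (again Python just for convenience; the helper width_in_bytes and the choice of sample characters are assumptions for illustration). It counts the bytes one code point needs in each encoding:

```python
def width_in_bytes(ch):
    """Return (utf8_bytes, utf16_bytes) needed to encode one code point."""
    return len(ch.encode("utf-8")), len(ch.encode("utf-16-le"))

# Sample code points, one from each UTF-8 length class.
samples = {
    "latin A": "A",                  # U+0041
    "e acute": "\u00e9",             # U+00E9
    "euro sign": "\u20ac",           # U+20AC
    "musical g clef": "\U0001d11e",  # U+1D11E, outside the Basic Multilingual Plane
}

if __name__ == "__main__":
    for name, ch in samples.items():
        utf8_len, utf16_len = width_in_bytes(ch)
        print(f"{name}: UTF-8 {utf8_len} byte(s), UTF-16 {utf16_len} byte(s)")
```

The last sample needs a surrogate pair in UTF-16, so UTF-16 is a variable-width encoding as well; it just hides the fact for most everyday text.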
> Because we are mainly on Windows (at least for the time being) I'd prefer
> a UTF-16 encoding. It seems a more strategic approach, but I don't know how
> much work this implies in the internals of the VCL.
Good point. Also, I think Delphi.NET and .NET in general are all based on
UTF-16 (and let's face it, this will be the main reason why CodeGear
will be looking towards UTF-16).
> Endianness: the Windows native one.
Again, with UTF-8 we'll never even need to make that distinction.
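
A minimal Python sketch of why the distinction exists for UTF-16 but not for UTF-8 (the sample string is made up; the codec names and BOM constants are from Python's standard library):

```python
import codecs

text = "A\u20ac"  # hypothetical sample: 'A' followed by the euro sign

utf16_le = text.encode("utf-16-le")  # little-endian, the byte order Windows uses
utf16_be = text.encode("utf-16-be")  # big-endian
utf8 = text.encode("utf-8")          # only one possible byte sequence

# Reading little-endian data with the wrong byte order does not fail here,
# it silently produces different characters.
misread = utf16_le.decode("utf-16-be")

if __name__ == "__main__":
    print("UTF-16 LE:", utf16_le.hex(" "))
    print("UTF-16 BE:", utf16_be.hex(" "))
    print("UTF-8:    ", utf8.hex(" "))
    print("BOMs:", codecs.BOM_UTF16_LE.hex(" "), "/", codecs.BOM_UTF16_BE.hex(" "))
    print("LE misread as BE equals original:", misread == text)
```

This is why UTF-16 streams usually carry a byte order mark or an out-of-band agreement on byte order, while UTF-8 needs neither.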
> As an aside, Java and Mac OS X also use UTF-16. On the Linux side, Qt uses it too.
> It seems that it will be the future.
Yep. However, in terms of source-level compatibility, ideally there
really shouldn't be any difference between source code using UTF-16 and
source code using UTF-8 encoding.
Unicoding Delphi is not a trivial task. There are so many considerations.
Old code can't be broken. Unicode creeps into so many unexpected places.
(Ever tried to zip a Unicode filename with WinZip?)
Then again, the OS has been almost 100% Unicode based ever since NT4. So
there's no excuse for Delphi not to embrace Unicode 100%.
For a programmer I think the biggest change will be the need to mentally
and explicitly contextualise every string.
Previously, most programmers didn't even think consciously about what was
"in" a string, implicitly assuming that it was just a byte string of
(ANSI) characters. And now we need to move to an extended concept.
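
A small Python sketch of what "contextualising" a string means in practice: the same bytes are different text depending on the encoding you assume. Here "ANSI" is taken to mean Windows code page 1252, which is an assumption for the example; the sample word is made up too.

```python
# The bytes of a hypothetical sample word, encoded as UTF-8.
raw = "caf\u00e9".encode("utf-8")

# Interpreting the bytes with the encoding they were written in.
as_utf8 = raw.decode("utf-8")

# Interpreting the very same bytes as "ANSI" (assumed here: code page 1252).
as_ansi = raw.decode("cp1252")

if __name__ == "__main__":
    print("bytes:        ", raw.hex(" "), f"({len(raw)} bytes)")
    print("read as UTF-8:", as_utf8, f"({len(as_utf8)} characters)")
    print("read as ANSI: ", as_ansi, f"({len(as_ansi)} characters)")
```

Neither decode raises an error, so nothing tells you which reading was intended; the programmer has to know.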