Re: Improving i18n support (Was: Re: Christmas



Larry W. Virden wrote:
Kevin Kenny wrote:
Larry W. Virden wrote:
The most surprising thing to me, as a developer, is that I'm expected
to use obscure binary/octal/hex codes when attempting to write Unicode
strings. That seems so... archaic.
That's surely a restriction of whatever editor you're using, not of Tcl.

Hmm - perhaps my description was either shortsighted or misguided.
Here's the point I was trying to make. ASCII escape sequences like
"\u084" or whatever are how I most frequently see people code Unicode
into Tcl scripts. However, after reading your comment, I now understand
that inserting the Unicode characters directly into scripts is also an
option.

However, that would be the equivalent of encoding the hex ASCII string
or a binary character for a newline into a Tcl script.

I'm just wishing that, like \r, \n, \t and so forth, Tcl had some sort
of normal, 7-bit ASCII shorthand for Unicode characters, so that, for
instance, someone reading the code could tell what the strings meant.
Right now, reading Tcl scripts that contain Unicode tells me nothing
about the strings if the direct binary is present (I don't have full
Unicode fonts available), and the \u084 style notation doesn't tell me
anything either.

One option for doing something like what you're thinking about would be
to use the long names given in the Unicode tables, but in general they
are rather long, and embedding the actual character as UTF-8 or using
the hex code is more convenient.
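
A minimal sketch of the long-name idea, assuming Tcl itself offers no
name-based escape (which is what the discussion above implies). The
uniNames table and the uchar proc are hypothetical helpers written for
this illustration, not part of Tcl; only the character names and code
points are taken from the Unicode tables.

```tcl
# Hypothetical lookup table: Unicode long name -> character.
# The script itself stays 7-bit ASCII because each character is
# written with the \u escape exactly once, next to its readable name.
array set uniNames [list \
    "LATIN SMALL LETTER E WITH ACUTE" \u00e9 \
    "EURO SIGN"                       \u20ac \
    "GREEK SMALL LETTER ALPHA"        \u03b1]

# Hypothetical helper: return the character for a Unicode long name,
# or raise an error if the name is not in the table above.
proc uchar {name} {
    global uniNames
    if {![info exists uniNames($name)]} {
        error "unknown character name: $name"
    }
    return $uniNames($name)
}

# A reader without Unicode fonts can still tell what this string contains.
set msg "caf[uchar {LATIN SMALL LETTER E WITH ACUTE}] costs 3 [uchar {EURO SIGN}]"
```

The cost is the verbosity mentioned above: the names are long, and the
table has to be maintained by hand for every character a script uses.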

For the \u notation, I find the Unicode charts (as PDF) especially
useful if I don't have a usable font:
http://www.unicode.org/charts/PDF/

Michael