Re: Attention: European C/C++/C#/Java Programmers-Call for Input




Did you miss the key point? *UNICODE*. They very specifically chose a
*standard* for their encodings, not something incompatible and
proprietary. In particular, it's very useful to be able to write comments
and strings in Unicode - many modern languages allow it. If you had
suggested using Unicode, or Latin-1, or listened to the idea when it was
suggested, then you'd have got far more support - it's the idea of having a
proprietary half-baked encoding that is incompatible with every other tool
that is "incredibly stupid".

My fault for phrasing my original question badly. I should
never have mentioned the words "character set". Forget that
there is an internal encoding method that is used in the compiler
tools for this new language whose codes will never be seen by its users.
The programming language supports only a subset of the complete
UNICODE character set for the Western European
alphabets. It recognizes at most 254 alphanumerics
(Basic Greek and Cyrillic are included) in variable
names etc. That count includes the underscore, which is regarded as
alphabetic but ordinally precedes all others. If Western European
programmers had to choose a subset of these characters for language
support, which ones would they choose?
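
To make the constraint concrete, here is a rough Python sketch of how such an identifier alphabet might be modelled. It is only an illustration, not the actual tool: the names build_alphabet, is_identifier, sort_key and HYPOTHETICAL_SUBSET are made up for this example, the characters in the subset are arbitrary placeholders, and the rule that an identifier cannot start with a digit is an assumption borrowed from C-like languages. Only the limit of 254 and the underscore ordering come from the description above.

```python
import string

MAX_ALPHANUMERICS = 254  # upper limit stated in the post

# Placeholder selection only; the real subset is exactly what is being asked about.
HYPOTHETICAL_SUBSET = (
    string.ascii_letters + string.digits
    + "äöüßéèêçñåøæ"   # a few Western European letters
    + "αβγδ"           # a few Greek letters
    + "бджя"           # a few Cyrillic letters
)


def build_alphabet(chars):
    """Map each chosen character to an internal ordinal, underscore first."""
    chosen = []
    for ch in chars:
        if ch == "_":
            continue  # the underscore is always added at ordinal 0 below
        if not ch.isalnum():
            raise ValueError(f"not alphanumeric: {ch!r}")
        if ch not in chosen:
            chosen.append(ch)
    # The underscore counts as alphabetic but ordinally precedes all others.
    ordered = ["_"] + sorted(chosen)
    if len(ordered) > MAX_ALPHANUMERICS:
        raise ValueError(f"{len(ordered)} characters exceed {MAX_ALPHANUMERICS}")
    return {ch: ordinal for ordinal, ch in enumerate(ordered)}


def is_identifier(name, alphabet):
    """True if every character is in the alphabet (assumed: no leading digit)."""
    if not name or name[0].isdecimal():
        return False
    return all(ch in alphabet for ch in name)


def sort_key(name, alphabet):
    """Collation key based on the internal ordinals, not on Unicode code points."""
    return [alphabet[ch] for ch in name]
```

With this placeholder subset, "größe_1" would be accepted as a variable name, "1abc" and "名前" would be rejected, and sorting by sort_key puts "_a" ahead of "a" because the underscore has the lowest ordinal.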

But I gather now that European programmers, for the most part,
don't care because these localized characters wouldn't be used in
their programming anyway because of the inter-operability
problems that arise when they are applied to source code. Since
the programmers I speak of are not interested in them, but space
has been allocated for many of them, I can take the huge tome of
UNICODE characters and make the choices myself, a naïve American :)
But I will also consider other subsets (some of which have been suggested
by helpful posters) in the process of making my final decision.

Thank you (really) for your input.

Paul

