Re: how to get dollars($) in c
- From: Harald van Dijk <truedfx@xxxxxxxxx>
- Date: Sun, 11 Nov 2007 18:21:58 +0000 (UTC)
On Sun, 11 Nov 2007 18:10:01 +0100, Tor Rustad wrote:
Harald van Dijk wrote:
On Sat, 10 Nov 2007 01:51:15 +0100, Tor Rustad wrote:
Ben Pfaff wrote:
Paro <zubair.jhara@xxxxxxxxx> writes:
Subject: how to get dollars($) in c
`$' is not in the basic source or execution character set and hence
cannot be used in portable C programs.
#include <stdio.h>
int main(void)
{
    printf("C99 has UCN, so \u0024 can be used in identifiers, "
           "character constants, and string literal. See 6.4.3\n");
    return 0;
}
While you are correct that C99 has UCNs, please keep in mind that not
all characters need be supported.
The intent of UCN was to provide a facility for e.g. the Asian
languages, to let them write strictly conforming programs using
characters outside the basic character set.
Almost, but not exactly. From the rationale:
[...] Thus, \unnnn can be used to designate a Unicode character. This
way, programs that must be fully portable may use virtually any
character from any script used in the world and still be portable,
provided of course that if it prints the character, the execution
character set has representation for it.
Fully portable programs can use \u00AA and the like in identifiers, but fully
portable programs cannot meaningfully use \u00AA in character constants.
The only C99 like compiler I currently have access to is
gcc -std=c99
Consider adding -pedantic. With that option, you'll at least get gcc in a
mode where it attempts to conform to C99, even if it doesn't yet. And
also, see below.
which on my installations, defines __STDC_ISO_10646__. I expect Unicode
characters to be supported, by the time we can start writing portable
programs in C99! Until then, it doesn't matter much (for me).
__STDC_ISO_10646__ is optional, so an implementation might choose to not
support it at all. If it is defined on your implementation, great, you
can rely on it on your implementation, but only there.
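To make the point about the optional macro concrete, here is a minimal sketch of a compile-time check. When it is defined, __STDC_ISO_10646__ is an integer constant of the form yyyymmL indicating that wchar_t values are ISO/IEC 10646 code points. The function names has_iso_10646 and report_iso_10646 are hypothetical, chosen only for this illustration.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 if this implementation defines
   __STDC_ISO_10646__, 0 otherwise. The answer is fixed at compile time. */
int has_iso_10646(void)
{
#ifdef __STDC_ISO_10646__
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: prints what this implementation claims. */
void report_iso_10646(void)
{
#ifdef __STDC_ISO_10646__
    /* The macro has the form yyyymmL, so it fits in a long. */
    printf("__STDC_ISO_10646__ = %ld\n", (long)__STDC_ISO_10646__);
#else
    puts("__STDC_ISO_10646__ is not defined here");
#endif
}
```

Whatever report_iso_10646() prints only describes the implementation that compiled it, which is exactly the limitation described above.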
The only case I can think of where using \u0024 in character constants
and/ or string literals makes sense is if the source character set is
different from the execution character set, and the dollar sign is only
supported in the latter. I suspect this is quite rare.
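As a small sketch of why \u0024 rarely buys anything in a string literal: C99 6.4.3 lists it, together with \u0040 and \u0060, as one of the few characters below 00A0 that may be spelled as a UCN, and where the implementation also accepts a literal `$' in source the two spellings would be expected to give the same string. The function name ucn_dollar_matches is hypothetical, and the code assumes the compiler accepts `$' in a string literal at all.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if the UCN spelling and the literal
   spelling of the dollar sign produce identical strings here.
   Assumes the implementation accepts '$' in source text. */
int ucn_dollar_matches(void)
{
    return strcmp("\u0024", "$") == 0;
}
```

If the two compare equal, the UCN form is only a more obscure way of writing the same byte sequence.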
The way I picture it isn't that programmers usually write these UCN
sequences by hand via U+NNNN. My keyboard has some special characters
too: "æøåÆØÅ", which I can access via a single keystroke.
However, if editing C source from another machine, e.g. using putty.exe
(getting a Linux login shell over SSH from Windows), those keystrokes
don't produce the expected result. Hence, I can't type
setlocale(LC_CTYPE, "");
printf("%ls\n", L"æøå"); <--- can't type
hard coding these Latin 1 characters doesn't work either:
printf("\xE6\xF8\xE5\n"); <--- display garbage
while, not only does this work:
printf("\u00e6\u00f8\u00e5\n"); <--- can type & display correctly
but can be typed from "everywhere".
You're assuming the locale will always be the same every time the program
is run. It's better to use %ls as in your original version,
setlocale(LC_CTYPE, "");
printf("%ls\n", L"\u00e6\u00f8\u00e5");
to minimise this assumption. That said, it's a step up from requiring
L"æøå", but still, this will work on implementations that support those
characters, and won't on those that don't.
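One way to see the "won't on those that don't" case at run time is to check what printf returns, since a failed wide-to-multibyte conversion for %ls is reported as a negative value. This is a minimal sketch; the function name print_wide_line is hypothetical.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: prints a wide string followed by a newline using
   %ls. Returns 0 on success, -1 if printf reports an error, for example
   because the current locale cannot encode one of the characters. */
int print_wide_line(const wchar_t *s)
{
    if (printf("%ls\n", s) < 0) {
        return -1;
    }
    return 0;
}
```

A caller would run setlocale(LC_CTYPE, "") first (from <locale.h>) and then call print_wide_line(L"\u00e6\u00f8\u00e5"). Whether that succeeds depends on the locale the program is run in.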
> And for readability, I would not recommend using it in identifiers at
> all, even when it's allowed.
UTF-8 enabled editors could display the C source in a
readable way.
Not unless you want to deal with the same problems again that you already
do: if a UTF-8 enabled editor automatically converts UCNs to/from UTF-8,
you might be able to enter L"\u00e6\u00f8\u00e5", but when reading it
back, if it tries to show L"æøå", it will quite likely give you garbled
output.
IMO, the main advantage of restricting source to English is that
"anyone" can maintain the source afterwards, but that may not be
important to e.g. a Japanese SW company.
For readability
int år, aar, year;
'år' is not only the most readable form for me, but is also the quickest
one to type (requires only two keystrokes). BTW, neither
int år, \u0005r;
works with latest GNU GCC, so UCN support appears to be broken. <g>
The former doesn't work because there is no required automatic conversion
from any character to a UCN. An implementation may choose to do it in
translation phase 1, but it's in no way required. The latter doesn't work
because \u0005 is not a valid UCN (6.4.3p2), and even if it were, it's
not a valid character in identifiers (6.4.2.1p3).
int \u00e5r;
is valid C99. <OT>It "works" with GCC when you enable the option to
support UCNs in identifiers (-fextended-identifiers). This is not enabled
by default, IIRC because the implementation is incomplete and/or broken,
so choose your interpretation of "works".</OT>
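For completeness, a minimal sketch of a UCN used in an identifier, following the `int \u00e5r;' form. The function name current_year_example and the value 2007 are purely illustrative, and whether a given compiler accepts this (with or without an extra option) is an assumption that has to be checked per implementation.

```c
/* Hypothetical helper: the local variable is spelled with a UCN
   (U+00E5, LATIN SMALL LETTER A WITH RING ABOVE, followed by 'r').
   Assumes the compiler supports UCNs in identifiers. */
int current_year_example(void)
{
    int \u00e5r = 2007;   /* illustrative value only */
    return \u00e5r;
}
```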