Re: A note on computing thugs and coding bums



Malcolm McLean said:


"Richard Heathfield" <rjh@xxxxxxxxxxxxxxx> wrote in message
(Expletives deleted.) C does not require that implementations use ASCII. It
allows them to use ASCII. If an implementor wants to use some other
character set (EBCDIC, Unicode, some other standardised set, or even a
custom character set), the C Standard endorses that decision subject
only to the very minor restrictions that: (a) the null character is the
character whose bits are all set to 0; (b) the required character set
characters must have positive values; (c) the digit characters '0' to
'9' have values that are contiguous and which ascend in the obvious way,
with '0' first and '9' last within the digit group. This is not a slap
in anyone's face, but a broad and liberal approach to character sets.
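To illustrate restriction (c), here is a minimal sketch of code that leans on the digit guarantee and nothing else, so it should behave the same under ASCII, EBCDIC, or a custom set. The names digit_value and parse_unsigned are made up for this example; they are not standard library functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 0..9 for a decimal digit character,
 * or -1 if c is not a digit. It relies only on the guarantee that
 * '0'..'9' are contiguous and ascending, not on any ASCII value. */
int digit_value(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    return -1;
}

/* Hypothetical helper: converts the leading decimal digits of s to
 * an unsigned value and stops at the first non-digit. No overflow
 * check; unsigned arithmetic simply wraps. */
unsigned parse_unsigned(const char *s)
{
    unsigned n = 0;
    int d;

    assert(s != NULL);
    while ((d = digit_value(*s)) >= 0) {
        n = n * 10u + (unsigned)d;
        s++;
    }
    return n;
}
```

Note that the same trick is not portable for letters: c - 'a' assumes the letters are contiguous, which the Standard does not promise and which EBCDIC does not deliver.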

And sizeof(char) must equal one.

Is 1, by definition. But this doesn't mean that character set sizes are
restricted.

Which means that in practice you must use ASCII or face endless hassles.

No, it doesn't. For one thing, EBCDIC is just as easy to use as ASCII
(indeed, one might say it's easier in some respects). For another, C does
not require that implementors limit themselves to eight-bit bytes. It
requires that bytes must be *at least* eight bits wide, but they can
certainly be wider. If implementors wish to set CHAR_BIT to 9 or 13 or 16
or 17 or 31 or 32 or 57 or 119 or whatever (for values of "whatever" that
exceed 7), well, that's entirely up to them.
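A small sketch of how a program can state these assumptions instead of silently relying on eight-bit bytes. The typedef trick and the function names are illustrative choices of mine, nothing the Standard prescribes.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Compile-time check: the array size is -1 (a constraint violation)
 * if CHAR_BIT were ever below 8. On a conforming implementation this
 * always compiles. */
typedef char char_bit_is_at_least_8[(CHAR_BIT >= 8) ? 1 : -1];

/* Number of bits in an object of nbytes bytes. Uses CHAR_BIT rather
 * than a hard-coded 8, so it stays correct on wide-byte machines. */
size_t bits_in(size_t nbytes)
{
    return nbytes * (size_t)CHAR_BIT;
}

/* Run-time restatement of what the Standard guarantees. */
void check_char_assumptions(void)
{
    assert(sizeof(char) == 1);   /* true by definition           */
    assert(CHAR_BIT >= 8);       /* bytes may be wider, not less */
    assert(UCHAR_MAX >= 255);    /* follows from CHAR_BIT >= 8   */
}
```

On a platform with CHAR_BIT set to 16, sizeof(char) is still 1 and bits_in(sizeof(char)) yields 16, which is the point: the size of a char in bytes is fixed, the size of a byte in bits is not.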

Also there is the issue of embedded strings. It is extremely convenient
to be able to embed string literals in source code, although of course it
effectively ties you to English.

Presumably you mean the fact that the Standard allows implementations
considerable latitude when faced with source code containing characters
that are not in the required basic character set. I assure you that this
doesn't tie programmers to English. On Y2K I had to deal with a
considerable body of source code with comments (and identifiers!) written
in Italian.

Nevertheless I do see your point - and the way to deal with this is to keep
the prompts in a file that is made available at runtime, rather than in
the source. Yes, it's less convenient, but it's not /that/ big a deal. Not
a hassle you'd want to go through for Janet&John programs, though,
obviously.
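For what it's worth, a minimal sketch of the prompts-in-a-file approach. It assumes a made-up file format of one prompt per line, a made-up limit PROMPT_MAX_LEN, and a made-up function name load_prompts; a real program would want error reporting and some way to match prompts to their uses.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define PROMPT_MAX_LEN 128  /* hypothetical limit, including the '\0' */

/* Reads up to max_prompts lines from fp into prompts[], stripping the
 * trailing newline from each. Returns the number of prompts read.
 * A line longer than PROMPT_MAX_LEN - 1 characters is split across
 * consecutive entries; this sketch does not treat that as an error. */
size_t load_prompts(FILE *fp, char prompts[][PROMPT_MAX_LEN],
                    size_t max_prompts)
{
    size_t count = 0;

    assert(fp != NULL && prompts != NULL);
    while (count < max_prompts &&
           fgets(prompts[count], PROMPT_MAX_LEN, fp) != NULL) {
        prompts[count][strcspn(prompts[count], "\n")] = '\0';
        count++;
    }
    return count;
}
```

The caller opens whichever prompt file suits the user's language and passes the stream in, so the source itself carries no natural-language text beyond a file name.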

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999