Re: A note on personal corruption as a result of using C
- From: spinoza1111 <spinoza1111@xxxxxxxxx>
- Date: Mon, 18 Feb 2008 09:37:40 -0800 (PST)
On Feb 18, 3:25 pm, "Clive D. W. Feather" <cl...@on-the-train.demon.co.uk> wrote:
In article <87fxvr6m4p....@xxxxxxxxxxxxxxx>, Keith Thompson
<ks...@xxxxxxx> writes
spinoza1111 <spinoza1...@xxxxxxxxx> writes:
[...]
THE LIES OF C
"A string cannot contain Nuls" Yes it can. Data arrives as strings and
needs to be validated, but C makes it impossible. It becomes
impossible to write effective string validation routines by definition
if the input isn't even a string: garbage in, garbage out was never
truer.
The C standard provides its own definition of the term "string",
namely (C99 7.1.1p1):
I wouldn't bother. Nilges refuses to accept that there are two common
ways of representing strings:
(1) Count plus array of characters; this can include any character in
the string, but is limited by the maximum value of the count.
It doesn't have to be. If the hardware or software supports unbounded
integers, and these are encapsulated within the string object, then in
fact there is no a priori limit to the string length.
Alternatively, the object representation could use an int or long with
a bound, but maintain longer strings, gracefully if with some gracious
regret returning NAN or "unknown" when the string exceeds the bound
without breaking. What part of "fault tolerance" don't you understand?
These representations are uncommon. Perhaps that's the problem, and
perhaps you hound people like Schildt because they, on real computers
having to solve real problems, pay for your mistakes and lack of
imagination.
(2) Array of characters with a terminator; this has no limit on the
string length, but the terminator can't appear anywhere else in the
string.
C uses the second representation, some other languages the first. I
don't know why K&R made that choice, but it may have been to allow
easier pointer manipulation of strings. Nilges, however, believes it was
an anti-Vietnam protest.
Do me the courtesy of quoting me accurately. I can see clearly enough
that you have far lower ethical standards than used to obtain in the
computer industry when I was starting out.
Perhaps you'd like to have a way to represent a sequence of characters
that can include embedded null characters. C's standard "string"
doesn't give you this (for example, strlen() computes the length of a
string up to but not including the terminating null character), but
you can certainly create your own data structure that supports this.
For example, you could have a structure consisting of a length and an
array of characters.
Indeed.
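A minimal sketch of the "length and an array of characters" structure suggested above. The names counted_str, cs_from_bytes and cs_free are invented for illustration and do not come from any standard library.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical counted string: the length is stored explicitly, so the
   data may contain '\0' at any position. */
struct counted_str {
    size_t len;   /* number of bytes in data */
    char *data;   /* not necessarily null-terminated */
};

/* Copy n bytes from src into a newly allocated counted string.
   Returns 0 on success, -1 if allocation fails. */
int cs_from_bytes(struct counted_str *out, const char *src, size_t n)
{
    assert(out != NULL && (src != NULL || n == 0));
    out->data = malloc(n ? n : 1);  /* avoid the malloc(0) corner case */
    if (out->data == NULL)
        return -1;
    if (n > 0)
        memcpy(out->data, src, n);
    out->len = n;
    return 0;
}

/* Release the storage and leave the object empty. */
void cs_free(struct counted_str *s)
{
    free(s->data);
    s->data = NULL;
    s->len = 0;
}
```

Given the seven bytes "abc\0def", cs_from_bytes records a length of 7, whereas strlen() on the same bytes stops at 3.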
One could certainly argue that C's choice of this particular
representation for strings was a bad idea.
I wouldn't go that strong. It has its advantages and one small
disadvantage. I can't remember being harmed by that disadvantage, but,
there again, I'm used to coding around it.
Or perhaps you object to C
choosing to apply a fairly generic term "string" to one specific
representation.
There again, I could object to BCPL using the term "string" to mean
something that can't have a length greater than 255 characters. So long
as the context is clear, I don't think it actually hurts for C to do
this.
"A string is 8 bytes" No it isn't and this has not been the case since
under Deng Xiao Peng China entered the real world.
It took me a while to figure out what you meant by this. Of course a
string isn't 8 bytes, unless it just happens to have that length. I
think what you meant is that C claims that a string is composed of
8-bit characters.
If this is what he means, it's a quick reversal. Only a few weeks ago he
seemed to be objecting to the idea that a byte might be more than 8
bits. To quote:
| -Nilges-> To say that C can with anything but clumsiness handle wider
| (or narrower) characters is false, since it is so falsely "powerful"
| that to change a C program to handle other than 8 bits implies a line
| by line audit. Herb is pragmatically correct.
I may also be mixing this up with something else, but I believe he also
objected to ASCII being described as 7 bits.
Do me the courtesy...
But yes, in practice, CHAR_BIT==8 on all real-world hosted
implementations that I'm aware of.
In particular, this is a requirement of POSIX (something they added
after I pointed out the implications of not having it).
C does provide a wchar_t type
capable (at least potentially) of representing larger character sets,
though support for it is somewhat sketchy.
Um, what is missing? Noting that a *lot* got added in C94.
"Aliasing gives me power" No it doesn't. It means that you have been
too lazy to find an effective algorithm.
I don't know what you mean by that. Is it something to do with what's
sometimes called "type punning"? Perhaps an example would help.
I suspect it's a reference to something I wrote in 1994:
========
However, what is important is why
only some lvalues can refer to a given object, and the annotations
completely skip this. The reason is, of course, to indicate when a
compiler can assume that two identifiers refer to the same object.
For example, in:
    char *cp;
    int *ip;

    void f (double *d)
    {
        *d = 3.14159;
        *cp = 1;
        *ip = 2;
    }
The rules of this section say that the assignment to *cp could
potentially alter *d, and the compiler must generate code that takes
that into account, but the assignment to *ip cannot, and the compiler
may assume that *d and *ip do not overlap. This is called aliasing,
and knowing when aliasing takes effect is an important factor in
correctly optimising code.
========
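The character-type exception described in the quoted text can be sketched as below. The function name copy_double_bytes is hypothetical; the point is only that access through unsigned char * is permitted to alias an object of any type, which is why the store through *cp in the example above must be assumed to touch *d.

```c
#include <assert.h>
#include <stddef.h>

/* Copy the object representation of a double one byte at a time.
   Stores through unsigned char * may alias any object, so the compiler
   must assume each store can change *dst. */
void copy_double_bytes(double *dst, const double *src)
{
    unsigned char *d = (unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)src;
    size_t i;

    assert(dst != NULL && src != NULL);
    for (i = 0; i < sizeof *dst; i++)
        d[i] = s[i];
}
```

Doing the same thing through an int * would fall outside the rule, which is what lets the compiler assume that *ip and *d do not overlap.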
and his recent response:
| -Nilges-> The only way of "correctly optimizing code" is not to use
| aliasing so pathologically but to intelligently use an optimizing
| compiler. The compiler determines whether the lValues can refer to the
| same object. The programmer should avoid using global variables as
| much as possible; this is the real lesson of the above crap code,
| along with the need to organize things into structs when they are
| global.
"Don't make me think! Just make the behavior undefined in the
standard!" Up yours, pal.
The standard does not define the behavior of all possible constructs.
He doesn't seem to like this concept. Indeed, at times, he believes that
the *correct* meaning of any C construct is what his MS-DOS compiler
produced.
Perhaps you'd like to design a systems programming language that
doesn't have this problem. Perhaps you could make technical points
without resorting to abusive language.
The time for this has passed. As Malcolm here has pointed out, people
are now more interested in personalities, and in the register of
dominance and control, than new languages. I am, however, developing a
new language, called spinoza. You can bet your ass that I won't
discuss it here.
I wouldn't hold your breath.
"A regular expression is what my code can handle" No, it isn't: the
theory was developed before computers.
I don't understand whatever point you're making here. If you're
refuting a claim that someone has made, I don't recall seeing it.
I *think* this is something to do with a regular expression parser in a
book called "Beautiful Code" which doesn't handle Unicode or certain
badly-formed regexes. But I must admit to only skimming those threads.
"A struct is a class" No, it isn't.
No such claim has been made with regard to C, because C doesn't have
classes.
Indeed.
"A for is just while with sugar": no it isn't. The for loop needs to
evaluate invariants before it starts, but if you send a boy (Ritchie)
to do a man's job, you get a useless for which might as well be a
while.
Surely you could have made whatever point you're making without
personal insults.
A C for loop is just a while with sugar, as you acknowledge in the
same paragraph ("might as well be a while"). Perhaps what you mean is
that a C for loop isn't defined the way you want it to be.
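A small sketch of the "while with sugar" point: in both forms the controlling expression is evaluated before every iteration, not once. The helper limit() and the counter calls are made up for this illustration and stand in for any bound that costs something to compute.

```c
#include <assert.h>

static int calls;  /* how many times limit() has been called */

/* Hypothetical stand-in for an expensive bound such as strlen(s). */
static int limit(void)
{
    calls++;
    return 3;
}

/* Number of times the for loop evaluates its condition. */
int tests_in_for(void)
{
    int i;

    calls = 0;
    for (i = 0; i < limit(); i++)
        ;  /* empty body */
    return calls;
}

/* The same loop written as a while. */
int tests_in_while(void)
{
    int i = 0;

    calls = 0;
    while (i < limit())
        i++;
    return calls;
}
```

Both functions report the same count: three tests that succeed and a fourth that ends the loop.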
No, it's defined so as to provide no functionality over and above
while, that is the ability to evaluate the invariant properly (once).
You proceeded to make the same mistake in your bubble sort. And your
Once upon a time, Edward wrote a piece of code that went something like
this (I haven't memorised it, so exact details might be wrong):
    for (intIndex = 0; intIndex < strlen (strString); intIndex++)
        strCopy [intIndex] = strString [intIndex];
Don't worry too much about the detail; the key points are the use of
strlen in the loop test and the fact that the string is not modified by
the loop body.
Somebody pointed out that this was inefficient because strlen() gets
called every time round the loop and it would be better to compute it
once and store it in a variable. For some reason he seems to think that
this was a personal attack and, rather than accepting he misremembered
the semantics of loops in C, claims that the condition *should* be
evaluated only once. Or something like that - it's not always easy to
understand our Edward.
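The suggested fix, sketched under the assumption that the destination buffer is large enough. The name copy_hoisted is invented here. A compiler may sometimes do this hoisting itself when it can prove the string is not modified, but the language only promises that the condition is re-evaluated on each iteration.

```c
#include <assert.h>
#include <string.h>

/* Copy src into dst, which must have room for strlen(src) + 1 bytes.
   strlen is called once, before the loop, rather than on every test. */
void copy_hoisted(char *dst, const char *src)
{
    size_t len, i;

    assert(dst != NULL && src != NULL);
    len = strlen(src);        /* evaluated once */
    for (i = 0; i < len; i++)
        dst[i] = src[i];
    dst[len] = '\0';          /* the loop does not copy the terminator */
}
```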
tone is vicious here, because you know you did, and you also know that
because people are afraid of your status, you weren't hounded.
C needs a true for in which the invariant parts of the subexpression
are evaluated once and then stacked so as to be reusable in the for.
This is the way I implemented the quick basic for in my compiler, and
it is the right way, since in most for loops, the end of the thing
being processed IS an invariant.
Indeed, in your bubble sort, you used the invariant without thinking
that it would be evaluated in C repeatedly because a significant
difference between while and for in GOOD code is that the values in
the while may change whereas for is "for" fixed size objects.
Much later I *very* briefly mentioned the idea of a "parallelising C"
which might write:
    for (i in eachof [0, strlen (s) - 1])
        d [i] = s [i];
or something like that. Clearly it's a conspiracy that nobody told me
off for putting a function call in the loop condition.
"Here's the preprocessor. Don't use it.": a Biblical injunction:
here's the apple tree, guys. Don't eat the apple and don't drink the
Kool-Aid: but you must and will, and I'm God.
Who exactly is claiming to be "God"?
The C preprocessor is a powerful tool, and it's extremely easy to
abuse and/or misuse it.
I *think* the history of this is something like:
Nilges: memcpy is dangerous.
Others: not if used properly.
Nilges: yes it is, because a programmer can write
#define memcpy something_else
and you can't spot the breakage without auditing every single line of
the code. Other languages don't have this kind of global scope.
Others: nobody sane would do that in the way you've written.
Nilges: (sigh) what part of crazy code generation by automated tools
written by clowns are your REAL competence level (as opposed to the
competence level you project recreationally) don't you understand?
Haven't you ever worked for a computer manufacturer or a software
company which uses preprocessor commands pathologically to satisfy
multiple customers, Horatio?
As shown elsewhere (look for TICKS_PER_SEC (sic) if you want examples)
he doesn't understand the difference between #define in standard
headers and in user code. And
he doesn't accept the idea of sharp tools that can be dangerous if
misused.
I'm sickened by the replacement of truths by metaphors, which geeks
use to project a false masculinity. Computer software isn't a tool,
it's a text.
He also hasn't spotted that his beloved Algol has features which can do
As I have said, Algol was drowned at birth. As such it never became my
"beloved" Algol. Instead, I happen to be an admirer of the integrity
of its developers, and their wide culture outside of technology: the
Revised report quoted Wittgenstein.
far more damage than the C pre-processor; I can write a line at the
*end* of an Algol program that completely changes the meaning of the
rest of the code and would require a line-by-line audit to spot.
"Pointers are unsigned integers and as such comparable. No they
aren't. Yes. Just kidding, they aren't.": C "experts" often sound like
the villain in the movie Dodgeball, "White" Goodman.
No, pointers aren't unsigned integers. They can be compared *as
pointers*.
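A short sketch of comparing pointers as pointers, without converting them to integers. The function names are hypothetical. The relational comparison and the subtraction are only defined when both pointers refer to elements of the same array (or one past its end).

```c
#include <assert.h>
#include <stddef.h>

/* Return nonzero if p points at an earlier element than q.
   Both must point into, or one past the end of, the same array. */
int comes_before(const int *p, const int *q)
{
    return p < q;
}

/* Distance in elements from p to q within the same array. */
ptrdiff_t elements_between(const int *p, const int *q)
{
    return q - p;
}
```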
I haven't the slightest idea what he's complaining about here.
"Here's post and pre increment. Don't try to guess when they happen":
there weren't enough lies in 1999, so yippee let's standardise the
language and add more!
The C standard does not define the behavior of certain expressions
such as ``i=i++;''. It states this clearly and explicitly. You can
argue that this is poor language design, but how is this a "lie"?
Perhaps you're using the word "lie" in some non-traditional way.
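For contrast with ``i=i++;'', a sketch of increments whose behavior is defined, because the object is modified once and not otherwise used in the same expression. The function names post_inc and pre_inc are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Post-increment: yields the old value and updates *i. */
int post_inc(int *i)
{
    assert(i != NULL);
    return (*i)++;
}

/* Pre-increment: updates *i and yields the new value. */
int pre_inc(int *i)
{
    assert(i != NULL);
    return ++(*i);
}
```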
You need the full backstory here.
Once upon a time there was someone called Herb Schildt, who Edward seems
to think is a demigod or at the least a beatified martyr ...
Oh, *** you, Clive: *** you, very much. Your illogic is stunning.
The logical negation of "x is a victim" is NOT "x is a demigod",
unless you think like a child...which you do.