Re: Theodore Adorno, a prophet of data systems design

From: Edward G. Nilges (spinoza1111_at_yahoo.com)
Date: 01/08/04


Date: 8 Jan 2004 13:58:02 -0800

Richard Heathfield <dontmail@address.co.uk.invalid> wrote in message news:<btiu3i$2ms$1@titan.btinternet.com>...
> Edward G. Nilges wrote:
>
> <snip>
>
> > There is no one set of rules for good C practice that guarantees no
> > memory leaks, overruns etc., and I can even prove this without knowing
> > as much about C as Richard Heathfield.
>
> But you can't. Your logic is flawed. I'll come to that in a moment. First,

No, you don't know the informal logic that applies to good practice
and for this reason you misapply the strict logic of the
counter-example, which doesn't disprove a normative rule.
 
> I'll show that there /is/ a set of rules that guarantees no memory leaks or
> overruns.
>
> Rule 1) Do not invoke undefined behaviour, at all, ever, in any of your C
> code.

This isn't a useful rule. Characteristic of a useful rule would be
that it would say "how" to avoid an undesirable result.

Advising programmers to memorize the standard is silly.

>
> Rule 2) Whenever you allocate memory, ensure that the program frees that
> memory at some point. (I ensure this in my own programs by wrapping up
> memory allocations in a well-defined way, and tracking them in the log
> file. I have a program which reads the log file looking for leaks.)
>
Which means, of course, that the leaks occur even for you.

My dear fellow, I have long since fallen prey to the false idea that
the more elaborate the test and bug prevention facilities, the more
likely they are to create, themselves, bugs, or to interact with other
facilities to create bugs.

Several objections therefore arise to your "leak log file".

(1) I would suppose that the log file goes on disk, but if your code
runs embedded or on a thin client, no disk is available.

(2) Program maintainers are constrained to follow the rules of the
language to compile the program but in no wise constrained to follow
your protocol especially if they find it incomprehensible.

(3) The solution won't work for multiple threads or processes.
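
For concreteness, here is a minimal sketch of the kind of allocation wrapper under dispute. It is not the code described in the quoted post: the names tracked_malloc, tracked_free and tracked_live are hypothetical, and it assumes an in-memory counter rather than a log file on disk. It is single-threaded as written, so objection (3) still applies unless the counter is protected by a lock.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names; not the wrapper described in the quoted post. */
static size_t live_blocks = 0;   /* allocations not yet freed */

/* Allocate through the wrapper so that every successful block is counted. */
void *tracked_malloc(size_t size)
{
    void *p = malloc(size);
    if (p != NULL)
        live_blocks++;
    return p;
}

/* Free through the wrapper; freeing NULL is a no-op, as with free(). */
void tracked_free(void *p)
{
    if (p == NULL)
        return;
    assert(live_blocks > 0);     /* more frees than allocations is a bug */
    live_blocks--;
    free(p);
}

/* Blocks still outstanding; a nonzero value at exit indicates a leak. */
size_t tracked_live(void)
{
    return live_blocks;
}
```

A caller would check tracked_live() just before exit. Objection (2) is untouched by this sketch: nothing stops a maintainer from calling malloc() directly and bypassing the count.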
 
> > It goes as follows: if there was one set of rules for good C practice,
> > "good C practice", in order to ensure safe TEAM development (whether
> > as above parallel in the case of a development team, or serial in the
> > case of a series of schnooks) would encapsulate these rules IN A NEW
> > LANGUAGE...and this is precisely what Stroustrup did when he used the
> > preprocessor to develop C++.
>
> I have two objections to that. The first is a counter-example, showing that
> it is possible to leak memory in C++:
>
> // Counter-example with memory leak
> int main()
> {
>   char *p;
>   while(p = new char [1024]);
>   return 0;
> }
>
> The second objection is that the C language leaves certain behaviours
> undefined for a very good reason. It IS possible to stop programmers doing
> stupid things, but only at the expense of greatly limiting the number of
> *clever* things they can do. Buffer overruns are a case in point. It is

The time for such cleverness is PAST.

> possible to design a language in which buffer overruns are impossible. But
> could a language do that *and* allow direct access to memory-mapped
> hardware, for example? Yes, obviously, but only at the expense of Yet
> Another Language Feature, and a plethora of language features makes the
> language harder to implement, harder to port, harder to learn, harder to

Well boo hoo.

A clever stunt indeed can short-circuit the learning process. Many
programmers remain in what a psychologist would call the "mirror"
stage of development in which they discover a clever stunt and say,
what a good boy I am. Since society as presently constituted does NOT
allow young men to reach their full potential except in the military,
they remain at the mirror stage, narcissistically enchanted by their
own cleverness.

I would in other words be rather more careful about whining that
things are "hard" for there are no end of wogs on the coral strand who
are willing to work hard.

> use, harder to debug, and harder to maintain; eventually the cost outweighs
> the advantage.
>
> C is not a language for stupid programmers, and Mr Nilges does well to avoid
> it.
>

Buttering me up, I see. Yet by implication it is easier to implement,
easier to port, easier to learn, easier to whatever in a rather
paradoxical way.

> > Lemme put this another way. If a language is so filled with problems
> > that the programmer has to know, in excess of computer SCIENCE, so
> > much trivia, arcana, bop Kabalah, Lost Secrets of the Old Ones,
> > Magick, palmistry, mesmerism and cold fusion, THEN the language is the
> > problem.
> >
> > Thus I prove from outside C "expertise" (which I fear resembles bop
> > Kabalah) that it is time to freeze new development in C.
>
> No, you don't. Your misunderstanding of C has led you to believe, wrongly,
> that the language is filled with problems. It isn't. It is filled, rather,
> with opportunities.

What, for a job security that's nonexistent?
>
> > This mode of argument is based on my limited training
>
> Indeed.
>
> >> > Real users make dreck work by heroically working around phenomena that
> >> > INCLUDE unpredictable behavior based on uninitialized variables (a
> >> > practice fostered by C),
> >>
> >> but forbidden by C (by the standard anyway; what in case it's a
> >> trap value?) and only appears in the code of bad programmers.
> >
> > But, it appears and is not prevented as it is in VB.
>
> If you want your objects to be initialised in C, initialise them. If you

There are no objects in C.

> don't, don't. Everything has a cost, including initialisation. If your code
> writes to an object before it reads from it, the initialisation is a
> redundant overhead.
>
> Personally, I /do/ initialise anyway, because /I/ think it makes my
> debugging easier if the program is in a deterministic state at all times,

...which means you are not equal to the task of debugging a multiply
threaded program which will be in a nondeterministic state some of the
time, and in which your job is NONETHELESS to guarantee a
deterministic result in the sense of weakest post-condition.

> but it's not for me to force my style decisions on others.
>
> <snip>
>
> >> > strange behavior near undocumented limits (a practice
> >>
> >> there are no undocumented limits, as far as I know. feel free
> >> to prove me wrong by posting the list of undocumented limits.
> >>
> > Not in C. However, the way in which C allocates arrays and fails to
> > provide hash tables as part of the language encourages secret decision
> > making which should be the user's as when the programmer allocates the
> > array as 1024 because 1024 seems like a big number.
>
> C makes the allocation of fixed-size arrays possible because there are some
> situations where fixed-size arrays are desirable. As the smallest example I
> can think of, if for some reason you want to reverse an octet's bit order,
> you might reasonably do it like this:
>
> unsigned char reverse_octet(unsigned char ch)
> {
>   static unsigned char nybble_r[16] =
>   {
>     0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15
>   };
>   unsigned char result = nybble_r[ch & 0xF] << 4;
>   ch >>= 4;
>   return result | nybble_r[ch & 0xF];
> }
>
> No matter how much the user might like to think otherwise, an octet is never
> going to have more than 8 bits (a byte is a different matter, obviously).
> The name is a real giveaway, you see. So a 16-element nybble array is
> always adequate. Fixed-size arrays /do/ have their place.
>
> If you use a fixed-size array when you should have used a dynamic array,
> that's your fault. C provides dynamic array support. If you don't use it
> when it's appropriate, don't blame the language.
>
> As for hash tables, they are trivial to implement using dynamic arrays. If
> you're not able to write your own hash table, C is clearly not your
> language.

Of course, there is an example at my Web site in virtualString.
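
For readers who want to see the claim made concrete, the following is a minimal sketch of a hash table built on a dynamically allocated bucket array. It is not the virtualString code mentioned above. All names (table_new, table_put, table_get, table_free) are hypothetical, keys are assumed to be NUL-terminated strings, values are ints, and the bucket count is fixed at creation with no rehashing.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical minimal hash table: string keys, int values, chained buckets. */
struct entry { char *key; int value; struct entry *next; };

struct table {
    struct entry **buckets;   /* dynamically allocated array of chains */
    size_t nbuckets;
    size_t count;             /* number of distinct keys stored */
};

static size_t hash(const char *s)
{
    size_t h = 5381;          /* simple multiplicative string hash */
    while (*s != '\0')
        h = h * 33 + (unsigned char)*s++;
    return h;
}

/* Create a table with a fixed number of buckets; NULL on allocation failure. */
struct table *table_new(size_t nbuckets)
{
    struct table *t;
    size_t i;
    assert(nbuckets > 0);
    t = malloc(sizeof *t);
    if (t == NULL)
        return NULL;
    t->buckets = malloc(nbuckets * sizeof *t->buckets);
    if (t->buckets == NULL) {
        free(t);
        return NULL;
    }
    for (i = 0; i < nbuckets; i++)
        t->buckets[i] = NULL;
    t->nbuckets = nbuckets;
    t->count = 0;
    return t;
}

/* Insert or update; returns 0 on success, -1 if allocation fails. */
int table_put(struct table *t, const char *key, int value)
{
    size_t i = hash(key) % t->nbuckets;
    size_t len = strlen(key) + 1;
    struct entry *e;
    for (e = t->buckets[i]; e != NULL; e = e->next) {
        if (strcmp(e->key, key) == 0) {
            e->value = value;
            return 0;
        }
    }
    e = malloc(sizeof *e);
    if (e == NULL)
        return -1;
    e->key = malloc(len);
    if (e->key == NULL) {
        free(e);
        return -1;
    }
    memcpy(e->key, key, len);          /* the table owns its copy of the key */
    e->value = value;
    e->next = t->buckets[i];
    t->buckets[i] = e;
    t->count++;
    return 0;
}

/* Look up a key; returns 1 and stores the value if found, 0 otherwise. */
int table_get(const struct table *t, const char *key, int *value)
{
    const struct entry *e;
    for (e = t->buckets[hash(key) % t->nbuckets]; e != NULL; e = e->next) {
        if (strcmp(e->key, key) == 0) {
            *value = e->value;
            return 1;
        }
    }
    return 0;
}

/* Release every entry, every key copy, the bucket array and the table. */
void table_free(struct table *t)
{
    size_t i;
    if (t == NULL)
        return;
    for (i = 0; i < t->nbuckets; i++) {
        struct entry *e = t->buckets[i];
        while (e != NULL) {
            struct entry *next = e->next;
            free(e->key);
            free(e);
            e = next;
        }
    }
    free(t->buckets);
    free(t);
}
```

The chains are unbounded, so the number of keys is limited only by available memory. Lookups slow down as the chains lengthen, because this sketch never grows the bucket array.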
>
> > One of my issues with using C on a VMS system to develop a "C-like"
> > compiler
>
> If indeed you did...
>
> > for business rules was I did not want to in any way bound the
> > complexity of the rules. But all I had in C was the linked list
> > perhaps with a hash index. I found the repeated coding of linked lists
> > tedious and not suitable work for a gentleman.
>
> If you'd been clever, you'd have written one linked list ADT, and re-used
> it. That's really the whole point of C, you see - code re-use.

No, a UDT (not ADT, whatever that is) would not solve the problem
because it consists of two sets of code: the structure declaration and
the n>=1 functions to implement what wants to be an object but cannot
be an object.

This code has to be manually inserted in a primitive way.

What comes closest is a UDT and a set of macro instructions carefully
classified as of the expression form (#define a ( )) or the statement
form (#define a { }) but the downside of using macro instructions is
very well known in terms of acceptability in polite company.

It adds to C's well-known obfuscation a unique form of obfuscation in
which the maintenance programmer has no way of knowing what a class
act one is. This creates the epistemological problem of trust and the
result is in practice that the maintenance programmer typically
decides to discard the macro instructions.
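
To make the shape of the disagreement visible, here is a minimal sketch of a linked list written once over void * payloads. It consists of exactly the two pieces described above, a structure declaration and a handful of functions. The names list_init, list_push and list_pop are hypothetical, and the payloads are assumed to be owned by the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical generic singly linked list; payloads are caller-owned. */
struct node {
    void *data;
    struct node *next;
};

struct list {
    struct node *head;
    size_t length;
};

/* Put a list into the empty state. */
void list_init(struct list *l)
{
    assert(l != NULL);
    l->head = NULL;
    l->length = 0;
}

/* Push onto the front; returns 0 on success, -1 if allocation fails. */
int list_push(struct list *l, void *data)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->data = data;
    n->next = l->head;
    l->head = n;
    l->length++;
    return 0;
}

/* Pop from the front; returns the payload, or NULL if the list is empty. */
void *list_pop(struct list *l)
{
    struct node *n = l->head;
    void *data;
    if (n == NULL)
        return NULL;
    data = n->data;
    l->head = n->next;
    l->length--;
    free(n);
    return data;
}
```

The same three functions serve any payload type, which is the reuse being claimed. The cost is that void * discards type information, so the compiler cannot catch a caller who pops an int * and treats it as something else.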

Code re-use is NOT the "whole point of C". In fact and in reaction to
the Multics culture, which did stress reuse, Kernighan and Ritchie
wanted to make the language sufficiently simple so that it would be
faster to write new code in C than, in some cases, to look up a
"reusable" solution.

>
> >> > fostered by the arrogant way in which C programmers use constant array
> >> > limits), strange string behavior (a practice fostered by the absurd
> >> > Nul limit of C),
> >>
> >> how is that a limit. the size of a C string is constrained only
> >> by the memory you have available. any other method of storing strings
> >> in memory imposes arbitrary limits.
> >>
> > It's a limit because a central computing insight is the refusal to
> > treat any computing object as "special" because of the demands of
> > one's system.
>
> If you don't like null-terminated strings, C is powerful enough to let you
> design and implement your own string model. I've done this myself. It's
> hardly difficult.
>
Oh. Cool. Was that before or after you learned about memstr?
 
> <snip>
>
> > The ideal is in fact to be able to create a string, and be damned.
> > There should be absolutely no limit on the potential size of the
> > string.
>
> I see you're back on the "long strings" phase of your cycle. So, while
> you're in this mood, I will just ask quickly whether you consider a string
> size limit of 32 bytes to be acceptable.
>
>
> > This may be theoretically impossible.
>
> It is, if you want to store this string. Ultimately, there are only so many
> electrons to go round.
>
> > The second best is to make sure that the strings, that cannot be
> > admitted to the community of strings, are truly special in so many
> > ways that they are truly unusual by definition (strings containing NUL
> > are common). And, representing string length as a long int
> > accomplishes this goal.
>
> Or a size_t, which can double the range of string lengths you can represent.
> Or a pair of pointers, one of which points to the end of the string.
> (Personally, I use three pointers; never mind why.) Or perhaps there's some
> even cleverer way that I haven't thought of. The point is, whatever way you
> choose, there's likely to be a way you can do it, in C.
>
This happens to be true of ANY TURING-COMPLETE PROGRAMMING LANGUAGE.
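
As an illustration of the length-carrying representation discussed above, here is a minimal sketch of a counted string that stores its length as a size_t, so an embedded NUL is just another byte. It is not any poster's actual string library. The names cstr_init, cstr_equal and cstr_free are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical counted string: the length is stored, not inferred from NUL. */
struct cstr {
    size_t len;
    unsigned char *bytes;   /* not NUL-terminated */
};

/* Copy len bytes from src; returns 0 on success, -1 if allocation fails. */
int cstr_init(struct cstr *s, const void *src, size_t len)
{
    assert(s != NULL);
    assert(src != NULL || len == 0);
    s->bytes = malloc(len > 0 ? len : 1);   /* avoid a zero-size request */
    if (s->bytes == NULL)
        return -1;
    if (len > 0)
        memcpy(s->bytes, src, len);
    s->len = len;
    return 0;
}

/* Byte-for-byte equality over the full stored length. */
int cstr_equal(const struct cstr *a, const struct cstr *b)
{
    return a->len == b->len && memcmp(a->bytes, b->bytes, a->len) == 0;
}

/* Release the storage and reset the string to empty. */
void cstr_free(struct cstr *s)
{
    free(s->bytes);
    s->bytes = NULL;
    s->len = 0;
}
```

Two five-byte strings that differ only after an embedded NUL compare unequal here, whereas strcmp() would stop at the NUL and call them equal.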
 
> >> > and the eternal green screen (a practice fostered by
> >> > C's favorite OS).
>
> C is OS-blind.

This isn't true. I taught SAS C for the IBM mainframe and despite the
fact that SAS C conformed to the standard, it could not expect the
same results for character comparisons as were expected in C running
on ASCII machines.
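
A minimal sketch of the kind of comparison that differs between the two character sets follows. The function names are hypothetical and this is not SAS C code. In ASCII the lowercase letters are contiguous, so a range test works. In EBCDIC they are not, so the same range test accepts some non-letters, while the <ctype.h> functions give the right answer on both.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>

/* Range test: exact on ASCII, too wide on EBCDIC, where 'a'..'z' has gaps. */
int is_lower_by_range(int c)
{
    return c >= 'a' && c <= 'z';
}

/* Portable test: defers to the implementation's own classification. */
int is_lower_portable(int c)
{
    assert(c >= 0 && c <= UCHAR_MAX);   /* islower() needs an unsigned char value */
    return islower(c) != 0;
}

/* 1 if the execution character set lays out 'a'..'z' without gaps. */
int letters_are_contiguous(void)
{
    return 'z' - 'a' == 25;
}
```

Ordering differs as well: ASCII places digits before uppercase before lowercase, while EBCDIC places lowercase before uppercase before digits, so a test such as 'a' < 'A' gives opposite answers on the two.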

>
> <snip>