Re: Theodore Adorno, a prophet of data systems design

From: Richard Heathfield (dontmail_at_address.co.uk.invalid)
Date: 01/08/04


Date: Thu, 8 Jan 2004 06:41:56 +0000 (UTC)

Edward G. Nilges wrote:

<snip>
 
> There is no one set of rules for good C practice that guarantees no
> memory leaks, overruns etc., and I can even prove this without knowing
> as much about C as Richard Heathfield.

But you can't. Your logic is flawed. I'll come to that in a moment. First,
I'll show that there /is/ a set of rules that guarantees no memory leaks or
overruns.

Rule 1) Do not invoke undefined behaviour, at all, ever, in any of your C
code.

Rule 2) Whenever you allocate memory, ensure that the program frees that
memory at some point. (I ensure this in my own programs by wrapping up
memory allocations in a well-defined way, and tracking them in the log
file. I have a program which reads the log file looking for leaks.)
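
For illustration only, here is a minimal sketch of the kind of wrapper
Rule 2 describes. It is not the wrapper mentioned above: the names
(tracked_malloc, tracked_free, tracked_outstanding, tracked_set_log) and
the log format are assumptions made up for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical names throughout; the log format is an assumption. */
static FILE *alloc_log = NULL;   /* where events are written; NULL = no log */
static long outstanding = 0;     /* blocks allocated but not yet freed */

void tracked_set_log(FILE *fp)
{
  alloc_log = fp;
}

/* Allocate size bytes and record the event. Returns NULL on failure. */
void *tracked_malloc(size_t size, const char *file, int line)
{
  void *p = malloc(size);
  if(p != NULL)
  {
    ++outstanding;
    if(alloc_log != NULL)
      fprintf(alloc_log, "ALLOC %p %lu %s:%d\n",
              p, (unsigned long)size, file, line);
  }
  return p;
}

/* Record the event, then release the block. A null pointer is ignored. */
void tracked_free(void *p, const char *file, int line)
{
  if(p != NULL)
  {
    assert(outstanding > 0);     /* more frees than allocations is a bug */
    --outstanding;
    if(alloc_log != NULL)
      fprintf(alloc_log, "FREE %p %s:%d\n", p, file, line);
    free(p);
  }
}

/* Number of blocks still owed to the system; 0 at exit means no leaks. */
long tracked_outstanding(void)
{
  return outstanding;
}

#define TRACKED_MALLOC(n) tracked_malloc((n), __FILE__, __LINE__)
#define TRACKED_FREE(p)   tracked_free((p), __FILE__, __LINE__)
```

A separate program could then pair up the ALLOC and FREE lines in the log
and report any address that was allocated but never freed.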

> It goes as follows: if there was one set of rules for good C practice,
> "good C practice", in order to ensure safe TEAM development (whether
> as above parallel in the case of a development team, or serial in the
> case of a series of schnooks) would encapsulate these rules IN A NEW
> LANGUAGE...and this is precisely what Stroustrup did when he used the
> preprocessor to develop C++.

I have two objections to that. The first is a counter-example, showing that
it is possible to leak memory in C++:

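// Counter-example with memory leak: every pass overwrites p, the only
// pointer to the previous 1024-byte block, which is never deleted.
int main()
{
  char *p;
  while((p = new char [1024]) != 0)
  {
    // On a conforming implementation, plain new throws std::bad_alloc
    // rather than returning null, so the loop ends by exception.
  }
  return 0;
}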

The second objection is that the C language leaves certain behaviours
undefined for a very good reason. It IS possible to stop programmers doing
stupid things, but only at the expense of greatly limiting the number of
*clever* things they can do. Buffer overruns are a case in point. It is
possible to design a language in which buffer overruns are impossible. But
could a language do that *and* allow direct access to memory-mapped
hardware, for example? Yes, obviously, but only at the expense of Yet
Another Language Feature, and a plethora of language features makes the
language harder to implement, harder to port, harder to learn, harder to
use, harder to debug, and harder to maintain; eventually the cost outweighs
the advantage.

C is not a language for stupid programmers, and Mr Nilges does well to avoid
it.

> Lemme put this another way. If a language is so filled with problems
> that the programmer has to know, in excess of computer SCIENCE, so
> much trivia, arcana, bop Kabalah, Lost Secrets of the Old Ones,
> Magick, palmistry, mesmerism and cold fusion, THEN the language is the
> problem.
>
> Thus I prove from outside C "expertise" (which I fear resembles bop
> Kabalah) that it is time to freeze new development in C.

No, you don't. Your misunderstanding of C has led you to believe, wrongly,
that the language is filled with problems. It isn't. It is filled, rather,
with opportunities.

> This mode of argument is based on my limited training

Indeed.

>> > Real users make dreck work by heroically working around phenomena that
>> > INCLUDE unpredictable behavior based on uninitialized variables (a
>> > practice fostered by C),
>>
>> but forbidden by C (by the standard anyway; what in case its a
>> trap value?) and only appears in the code of bad programmers.
>
> But, it appears and is not prevented as it is in VB.

If you want your objects to be initialised in C, initialise them. If you
don't, don't. Everything has a cost, including initialisation. If your code
writes to an object before it reads from it, the initialisation is a
redundant overhead.

Personally, I /do/ initialise anyway, because /I/ think it makes my
debugging easier if the program is in a deterministic state at all times,
but it's not for me to force my style decisions on others.

<snip>

>> > strange behavior near undocumented limits (a practice
>>
>> there are no undocumented limits, as far as I know. feel free
>> to prove me wrong by posting the list of undocumented limits.
>>
> Not in C. However, the way in which C allocates arrays and fails to
> provide hash tables as part of the language encourages secret decision
> making which should be the user's as when the programmer allocates the
> array as 1024 because 1024 seems like a big number.

C makes the allocation of fixed-size arrays possible because there are some
situations where fixed-size arrays are desirable. As the smallest example I
can think of, if for some reason you want to reverse an octet's bit order,
you might reasonably do it like this:

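/* Reverse the bit order of an octet, one nybble at a time. */
unsigned char reverse_octet(unsigned char ch)
{
  /* nybble_r[n] is the 4-bit value n with its bits reversed */
  static const unsigned char nybble_r[16] =
  {
    0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15
  };
  /* the reversed low nybble becomes the new high nybble */
  unsigned char result = nybble_r[ch & 0xF] << 4;
  ch >>= 4;
  /* the reversed high nybble becomes the new low nybble */
  return result | nybble_r[ch & 0xF];
}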

No matter how much the user might like to think otherwise, an octet is never
going to have more than 8 bits (a byte is a different matter, obviously).
The name is a real giveaway, you see. So a 16-element nybble array is
always adequate. Fixed-size arrays /do/ have their place.

If you use a fixed-size array when you should have used a dynamic array,
that's your fault. C provides dynamic array support. If you don't use it
when it's appropriate, don't blame the language.
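
As a rough sketch of what that dynamic array support can look like, here
is a growable array of int built on realloc. The type and function names
(int_vec, int_vec_push and so on) and the starting capacity of 16 are
assumptions for this example, not anything defined by the language.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable array of int; names are illustrative only. */
typedef struct
{
  int *data;
  size_t count;      /* elements in use */
  size_t capacity;   /* elements allocated */
} int_vec;

void int_vec_init(int_vec *v)
{
  v->data = NULL;
  v->count = 0;
  v->capacity = 0;
}

/* Append one value, doubling the capacity when the array is full.
   Returns 1 on success, 0 if the array could not be grown. */
int int_vec_push(int_vec *v, int value)
{
  if(v->count == v->capacity)
  {
    size_t new_cap = (v->capacity == 0) ? 16 : v->capacity * 2;
    int *tmp;
    if(new_cap <= v->capacity || new_cap > (size_t)-1 / sizeof *tmp)
      return 0;                  /* size calculation would overflow */
    tmp = realloc(v->data, new_cap * sizeof *tmp);
    if(tmp == NULL)
      return 0;                  /* the original block is still valid */
    v->data = tmp;
    v->capacity = new_cap;
  }
  v->data[v->count++] = value;
  return 1;
}

/* Release the storage and return the array to its empty state. */
void int_vec_free(int_vec *v)
{
  free(v->data);
  int_vec_init(v);
}
```

The array grows to whatever size the data needs, limited only by available
memory, so nobody has to guess that 1024 "seems like a big number".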

As for hash tables, they are trivial to implement using dynamic arrays. If
you're not able to write your own hash table, C is clearly not your
language.
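
To make that concrete, here is one possible sketch: a string-to-int table
using separate chaining, with the bucket array allocated at run time. All
the names (hash_table, ht_put, ht_get and friends) are hypothetical, the
hash function is just a simple multiplicative one, and the table does not
resize itself; a fuller version would grow the bucket array as it fills.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names throughout: a string-to-int table with chaining. */
typedef struct ht_node
{
  char *key;
  int value;
  struct ht_node *next;
} ht_node;

typedef struct
{
  ht_node **buckets;   /* dynamically allocated array of chain heads */
  size_t nbuckets;
} hash_table;

/* Bucket number for s; the table must have been initialised. */
static size_t ht_index(const hash_table *t, const char *s)
{
  size_t h = 5381;
  while(*s != '\0')
    h = h * 33 + (unsigned char)*s++;
  return h % t->nbuckets;
}

/* Node holding key, or NULL if the key is absent. */
static ht_node *ht_find(const hash_table *t, const char *key)
{
  ht_node *n = t->buckets[ht_index(t, key)];
  while(n != NULL && strcmp(n->key, key) != 0)
    n = n->next;
  return n;
}

/* Returns 1 on success, 0 on failure (zero buckets or no memory). */
int ht_init(hash_table *t, size_t nbuckets)
{
  size_t i;
  t->nbuckets = 0;
  t->buckets = NULL;
  if(nbuckets != 0 && nbuckets <= (size_t)-1 / sizeof *t->buckets)
    t->buckets = malloc(nbuckets * sizeof *t->buckets);
  if(t->buckets == NULL)
    return 0;
  for(i = 0; i < nbuckets; i++)
    t->buckets[i] = NULL;
  t->nbuckets = nbuckets;
  return 1;
}

/* Insert key or update its value. Returns 1 on success, 0 if no memory. */
int ht_put(hash_table *t, const char *key, int value)
{
  ht_node *n = ht_find(t, key);
  if(n == NULL)
  {
    size_t i = ht_index(t, key);
    n = malloc(sizeof *n);
    if(n == NULL)
      return 0;
    n->key = malloc(strlen(key) + 1);
    if(n->key == NULL)
    {
      free(n);
      return 0;
    }
    strcpy(n->key, key);
    n->next = t->buckets[i];
    t->buckets[i] = n;
  }
  n->value = value;
  return 1;
}

/* Pointer to the stored value, or NULL if the key is absent. */
int *ht_get(const hash_table *t, const char *key)
{
  ht_node *n = ht_find(t, key);
  return (n != NULL) ? &n->value : NULL;
}

/* Release everything; call ht_init again before any further use. */
void ht_free(hash_table *t)
{
  size_t i;
  for(i = 0; i < t->nbuckets; i++)
    while(t->buckets[i] != NULL)
    {
      ht_node *n = t->buckets[i];
      t->buckets[i] = n->next;
      free(n->key);
      free(n);
    }
  free(t->buckets);
}
```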

> One of my issues with using C on a VMS system to develop a "C-like"
> compiler

If indeed you did...

> for business rules was I did not want to in any way bound the
> complexity of the rules. But all I had in C was the linked list
> perhaps with a hash index. I found the repeated coding of linked lists
> tedious and not suitable work for a gentleman.

If you'd been clever, you'd have written one linked list ADT, and re-used
it. That's really the whole point of C, you see - code re-use.
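
One way such a reusable list might look, sketched under assumptions: the
list stores caller-owned void pointers so the same code serves any element
type, and every name here (list, list_append, list_walk, add_int) is made
up for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical generic list; it stores pointers owned by the caller. */
typedef struct list_node
{
  void *item;
  struct list_node *next;
} list_node;

typedef struct
{
  list_node *head;
  list_node *tail;
  size_t count;
} list;

void list_init(list *l)
{
  l->head = NULL;
  l->tail = NULL;
  l->count = 0;
}

/* Append item at the end. Returns 1 on success, 0 if out of memory. */
int list_append(list *l, void *item)
{
  list_node *n = malloc(sizeof *n);
  if(n == NULL)
    return 0;
  n->item = item;
  n->next = NULL;
  if(l->tail != NULL)
    l->tail->next = n;
  else
    l->head = n;
  l->tail = n;
  ++l->count;
  return 1;
}

/* Call fn(item, ctx) for each item, in insertion order. */
void list_walk(const list *l, void (*fn)(void *item, void *ctx), void *ctx)
{
  const list_node *n;
  for(n = l->head; n != NULL; n = n->next)
    fn(n->item, ctx);
}

/* Release the nodes; the items themselves remain the caller's business. */
void list_destroy(list *l)
{
  list_node *n = l->head;
  while(n != NULL)
  {
    list_node *next = n->next;
    free(n);
    n = next;
  }
  list_init(l);
}

/* Example callback: adds the int that item points at to the long at ctx. */
void add_int(void *item, void *ctx)
{
  *(long *)ctx += *(int *)item;
}
```

Written once, the same list code can hold business rules, symbols, or
anything else, with only the callbacks differing from one use to the next.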

>> > fostered by the arrogant way in which C programmers use constant array
>> > limits), strange string behavior (a practice fostered by the absurd
>> > Nul limit of C),
>>
>> how is that a limit. the size of a C string is constrained only
>> by the memory you have available. any other method of storing strings
>> in memory imposes arbitrary limits.
>>
> It's a limit because a central computing insight is the refusal to
> treat any computing object as "special" because of the demands of
> one's system.

If you don't like null-terminated strings, C is powerful enough to let you
design and implement your own string model. I've done this myself. It's
hardly difficult.

<snip>

> The ideal is in fact to be able to create a string, and be damned.
> There should be absolutely no limit on the potential size of the
> string.

I see you're back on the "long strings" phase of your cycle. So, while
you're in this mood, I will just ask quickly whether you consider a string
size limit of 32 bytes to be acceptable.

> This may be theoretically impossible.

It is, if you want to store this string. Ultimately, there are only so many
electrons to go round.

> The second best is to make sure that the strings, that cannot be
> admitted to the community of strings, are truly special in so many
> ways that they are truly unusual by definition (strings containing NUL
> are common). And, representing string length as a long int
> accomplishes this goal.

Or a size_t, which, being unsigned, can double the range of string lengths
you can represent (where size_t is at least as wide as long).
Or a pair of pointers, one of which points to the end of the string.
(Personally, I use three pointers; never mind why.) Or perhaps there's some
even cleverer way that I haven't thought of. The point is, whatever way you
choose, there's likely to be a way you can do it, in C.
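
As one hedged example of the size_t approach, here is a sketch of a
counted string in which a zero byte is ordinary data. It is not the
three-pointer design mentioned above; the names (counted_str,
counted_str_from, counted_str_append) are assumptions for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical counted string: the length is stored alongside the data,
   so a zero byte is ordinary content rather than a terminator. */
typedef struct
{
  char *data;
  size_t len;
} counted_str;

/* Initialise s with a copy of len bytes from src.
   Returns 1 on success, 0 if memory could not be obtained. */
int counted_str_from(counted_str *s, const void *src, size_t len)
{
  char *p = malloc(len > 0 ? len : 1);   /* avoid a zero-size request */
  if(p == NULL)
    return 0;
  if(len > 0)
    memcpy(p, src, len);
  s->data = p;
  s->len = len;
  return 1;
}

/* Append the contents of b to a (a and b may be the same object).
   Returns 1 on success, 0 on failure, in which case a is unchanged. */
int counted_str_append(counted_str *a, const counted_str *b)
{
  size_t blen = b->len;
  size_t total;
  char *p;
  if(blen > (size_t)-1 - a->len)
    return 0;                            /* combined length would overflow */
  total = a->len + blen;
  p = realloc(a->data, total > 0 ? total : 1);
  if(p == NULL)
    return 0;
  a->data = p;                           /* also updates b->data if a == b */
  if(blen > 0)
    memcpy(p + a->len, b->data, blen);
  a->len = total;
  return 1;
}

/* Release the storage and leave s empty. */
void counted_str_free(counted_str *s)
{
  free(s->data);
  s->data = NULL;
  s->len = 0;
}
```

The only limits on such a string are the range of size_t and the memory
available, which is the same limit every other representation faces.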

>> > and the eternal green screen (a practice fostered by
>> > C's favorite OS).

C is OS-blind.

<snip>

-- 
Richard Heathfield : binary@eton.powernet.co.uk
"Usenet is a strange place." - Dennis M Ritchie, 29 July 1999.
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
K&R answers, C books, etc: http://users.powernet.co.uk/eton

