Re: Bounds checked arrays

From: Nick Landsberg (hukolau_at_att.net)
Date: 02/15/04


Date: Sun, 15 Feb 2004 02:27:24 GMT


jacob navia wrote:

> As everybody knows, the C language lacks
> a way of specifying bounds checked arrays.
>
> This situation is intolerable for people that know
> that errors are easy to do, and putting today's
> powerful microprocessor to do a few instructions
> more at each array access will not make any
> difference what speed is concerned.
>
> Not all C applications are real-time apps.
>

But for those applications which are real-time
apps, the overhead for the bounds checking may
well be intolerable.

> Besides, there are the viruses
> and other malicious software that are using
> this problem in the C language to do their dirty
> work.

Actually, they are using the undisciplined coding
practices of dilettante C-coders to do their
dirty work.

>
> Security means that we avoid the consequences
> of mistakes and expose them as soon as possible.

Yes, if possible at compile time. Else, institute
real coding practices (rather than bogus ones which
say "use fixed arrays rather than malloc") and have
the code inspected by experts. A flag in the compiler
to do as much strict bounds checking as possible at
compile time would go part of the way to this end.

>
> It would be useful then, if we introduced into C
>
> #pragma STDC bounds_checking(ON/OFF)
>
> When the state of this toggle is ON, the compiler
> would accept declarations (like now)
>
> int array[2][3];
>
> The compiler would emit code that tests
> each index for a well formed index.
> Each index runs from zero to n-1, i.e.
> must be greater than zero and less than
> "n".
>
> In arrays of dimension "n", the compiler would
> emit code that tests "n" indices, before using
> them.
>
> Obviously, optimizations are possible, and
> good compilers will optimize away many tests
> specially in loops. This is left unspecified.
>
> Important is to know that the array updates
> can't overflow in neighboring memory areas.

As someone else pointed out, the calling sequences
for all subroutine calls may have to change
to pass the limits. If not, then, the implementation
may have to pass these without the programmer knowing
about it. The C language has a long-standing tradition
of a well-known "sentinel", the null character ('\0'),
which marks the end of a character array used as a string.
More importantly, what should the behaviour be
when array bounds are exceeded? You are proposing
a new standard behaviour which includes bounds checking.
(This, as someone else pointed out, would be an extension
to the language and would probably not be considered
unless there was at least one existing implementation,
an "existence proof".)
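
To illustrate the calling-sequence point, here is a minimal
sketch in today's C. The names sum_ints, string_length and
ARRAY_LEN are made up for this example. An array argument
arrives as a bare pointer, so the limit has to be passed
explicitly, or found by walking to a sentinel as strings do.

```c
#include <stddef.h>

/* Element count of a true array. Not valid on a pointer parameter. */
#define ARRAY_LEN(x) (sizeof (x) / sizeof (x)[0])

/* Hypothetical helper: sums the first n elements of a.
   Inside the function a is only a pointer, so the element
   count must arrive as a separate argument. */
long sum_ints(const int *a, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Hypothetical helper: finds the end of a string by walking
   to the '\0' sentinel instead of receiving a length. */
size_t string_length(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}
```

A caller with a real array would write
sum_ints(v, ARRAY_LEN(v)). A checking implementation would
have to carry that second argument for every array it
passes, visibly or behind the programmer's back.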

What should be the "standard behaviour" if the bounds
checks fail? Proposing a solution to a perceived
problem without proposing appropriate behaviour
when something like that happens, is, IMO, half
a solution.
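
To make the question concrete, here is a rough sketch of
what a check on the quoted int array[2][3] might expand to
if written by hand. The name checked_store and the choice
to return -1 on failure are assumptions of mine, not part
of the proposal. The failure branch is exactly the part
that is left unspecified.

```c
enum { ROWS = 2, COLS = 3 };

/* Hypothetical checked store into an int[2][3] array.
   Each index is tested separately against 0 <= index < n.
   Returns 0 on success. Returns -1 and leaves the array
   untouched if either index is out of range. */
int checked_store(int array[ROWS][COLS], int i, int j, int value)
{
    if (i < 0 || i >= ROWS || j < 0 || j >= COLS) {
        /* One possible policy among several: report and carry on.
           Calling abort() or a user-supplied handler are others. */
        return -1;
    }
    array[i][j] = value;
    return 0;
}
```

Note that the index pair (0, 3) still lands inside the
array's storage, so a test on the flattened offset alone
would not catch it. Each index needs its own test.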

There are many languages out there which perform
bounds checking. As noted elsethread, there are also many
languages which do not have pointers. There is a need for
such languages, otherwise they would not be there,
but they are NOT C.

>
> How many machine instructions does this cost?
>
> Each test is a comparison of an index with a
> constant value, and a conditional jump. If the
> compiler only emits forward branches, the
> branch predictor can correctly predict that in
> most cases the branch will NOT be taken.
>
> In abstract assembly this is 4 instructions:
> test if index >= 0
> jump if not "indexerror"
> test if index < "n"
> jump if not "indexerror"
>
> where "n" is a compile time constant.
>
> We have something like 4 cycles then, what
> a 2GHZ machine does in 0,000 000 004 seconds.
>
> Yes, table access is a common operation but
> it would take millions of those to slow the program
> a negligible quantity of time. We are not in the
> PDP-11 any more.

I differ with your analysis of the number of assembly
instructions, but that's a nit-pick. I work on systems
which need to do upwards of 10,000 database lookups
per second and the same order of magnitude of parsing
strings, etc. They involve copying information from one
memory space to another. For those applications we
use C. For applications with less stringent requirements,
e.g. "only" 1,000 database accesses per second, we use Java.
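
For what it is worth, the arithmetic behind both sets of
figures can be sketched as follows. The 2 GHz clock and the
4 cycles per check come from the quoted text, and 10,000
lookups per second is the rate mentioned above. One cycle
per instruction is a simplifying assumption, and the
function names are made up for illustration.

```c
/* Assumed clock rate, taken from the quoted "2GHZ machine". */
#define CLOCK_HZ 2000000000ULL

/* Time in nanoseconds for a given number of cycles,
   assuming every cycle takes exactly 1/CLOCK_HZ seconds. */
double check_cost_ns(unsigned cycles)
{
    return cycles * 1e9 / (double)CLOCK_HZ;
}

/* Cycles available per operation at a given rate,
   if the processor did nothing else. */
unsigned long long cycles_per_operation(unsigned long long ops_per_sec)
{
    return CLOCK_HZ / ops_per_sec;
}
```

Under those assumptions check_cost_ns(4) gives 2 ns, and
cycles_per_operation(10000) gives 200,000 cycles per
lookup. Whether the checks matter then depends on how many
array accesses each lookup makes, which the figures above
do not say.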

C has its place, Java has its place, other languages
have their place.

On the C applications, we take great care NOT to use
dubious constructs and code reviews by the lead
developers are required before the code even gets
to system test. (No, it does not catch all the
problems.) There is a discipline involved.
If you don't have that discipline, use another
language. (This last was not meant as a flame,
rather a simple statement of fact.)

>
> This would make C a little bit easier to program,
> and the resulting programs of better quality.
>

[Much Snipped]

>
> jacob
>
>

-- 
"It is impossible to make anything foolproof because fools are so 
ingenious" - A. Bloch

