Re: Bounds checked arrays
From: Rob Thorpe (robert.thorpe_at_antenova.com)
Date: 02/17/04
- Next message: Mark Shelor: "Re: Mystery: static variables & performance"
- Previous message: Sidney Cadot: "[OT] Re: 'erf' function in C"
- In reply to: jacob navia: "Bounds checked arrays"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Date: 16 Feb 2004 16:49:41 -0800
"jacob navia" <jacob@jacob.remcomp.fr> wrote in message news:<c0m3sa$96g$1@news-reader5.wanadoo.fr>...
> As everybody knows, the C language lacks
> a way of specifying bounds checked arrays.
>
> This situation is intolerable for people who know
> that errors are easy to make, and having today's
> powerful microprocessors execute a few more
> instructions at each array access will not make any
> difference as far as speed is concerned.
>
> Not all C applications are real-time apps.
>
> Besides, there are the viruses
> and other malicious software that are using
> this problem in the C language to do their dirty
> work.
>
> Security means that we avoid the consequences
> of mistakes and expose them as soon as possible.
>
> It would be useful then, if we introduced into C
>
> #pragma STDC bounds_checking(ON/OFF)
>
> When the state of this toggle is ON, the compiler
> would accept declarations (like now)
>
> int array[2][3];
>
> The compiler would emit code that tests
> each index for a well formed index.
> Each index runs from zero to n-1, i.e.
> must be greater than or equal to zero and
> less than "n".
>
> In arrays of dimension "n", the compiler would
> emit code that tests "n" indices, before using
> them.
>
> Obviously, optimizations are possible, and
> good compilers will optimize away many tests,
> especially in loops. This is left unspecified.
>
> What matters is that array updates
> can't overflow into neighboring memory areas.
>
> How many machine instructions does this cost?
>
> Each test is a comparison of an index with a
> constant value, and a conditional jump. If the
> compiler only emits forward branches, the
> branch predictor can correctly predict that in
> most cases the branch will NOT be taken.
>
> In abstract assembly this is 4 instructions:
> test if index >= 0
> jump if not "indexerror"
> test if index < "n"
> jump if not "indexerror"
>
> where "n" is a compile time constant.
>
> We have something like 4 cycles then, which
> a 2 GHz machine does in 0.000 000 002 seconds.
>
> Yes, table access is a common operation, but
> it would take millions of those to slow the program
> by even a negligible amount of time. We are not in the
> PDP-11 era any more.
>
> This would make C a little bit easier to program,
> and the resulting programs of better quality.
> Buffer overflows happen of course, but the language
> limits the consequences by enforcing limits.
>
> By default the behavior is to stop the program.
> The user can override this, and different schemas
> can be specified by him/her to take actions when
> a buffer overflow happens.
>
> A simple strategy is to just do nothing.
>
> int fn(char *input)
> {
>     char tmpbuf[BUFSIZ];
>     int i=0;
>     bool result = false;
>
>     while (*input) {
>         tmpbuf[i++] = *input++;
>     }
>     // Do things with the input
>     // set result
>     return result;
> indexerror:
>     return false;
> }
>
> This function uses the built-in error checking
> to avoid any bad consequence for an overflow.
> If the input data is too long, it is a mal-formed
> input that should be discarded.
>
> This frees the programmer from the tedious task
> of writing
> if (i >= sizeof(tmpbuf)) goto indexerror;
>
> at EACH array access. This can be done better
> by a machine and the compiler.
>
> Because a program like that today
> ***assumes*** the input length
> can't be bigger than BUFSIZ.
>
> This is always *implicitly* assumed and
> nowhere *enforced*, by the way. The current
> state implies that catastrophic errors can happen
> if the index starts overwriting separate memory
> areas like the return address...
>
> Everyone knows this. Let's do something to
> stop it. Something simple, without too much
> fuss.
>
> In this case the compiler generates code that
> in case of index error
> jumps to this label and does what the programmer
> specifies.
>
> The motto of C is: Trust the programmer.
>
> We have just to allow him/her to specify what to do
> in case of overflow.
>
> "Trust the programmer" doesn't mean that we trust
> that he never makes a mistake, of course. It means
> that the programmer can specify what actions
> to take in case of error, and that the language
> provides sensible defaults.
>
> The default, then, is to terminate the program, as the
> assert() macro does, another useful construct.
>
> Note that this proposal doesn't change anything
> in the language. No new constructs, even if
> compilers could provide arrangements like the
> one proposed above.
>
> I propose then:
>
> #pragma STDC bounds_checking(ON/OFF)
>
> that should be written outside a function scope.
>
> That's all.
>
> This proposal is an invitation to
> brain-storming..:-)
>
> I know that anyone using C is aware of this.
> So, let's fix it.
Sounds like a good idea. Since nothing gets standardised without
someone doing it first, why not implement it in LCC and see how many
problems are encountered?
Before you do, read:
http://www.doc.ic.ac.uk/~phjk/BoundsChecking.html
This is how it was done in TCC.
Perhaps try to make it work the same way as TCC, to get things started
without initial compatibility problems.
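For comparison, here is a rough sketch of what the proposed checks amount to when written by hand in today's C. The function name checked_copy and its parameters are hypothetical, made up for illustration; they are not part of the proposal, LCC or TCC. Using an unsigned index (size_t) folds the two tests from the abstract assembly into a single comparison, since the index can never be negative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: copy src into dst (capacity n bytes), testing every
   index before it is used, the way the proposed pragma would have the
   compiler do it. Returns false instead of writing out of bounds. */
bool checked_copy(char *dst, size_t n, const char *src)
{
    size_t i = 0;

    assert(dst != NULL && src != NULL);

    while (*src) {
        if (i >= n)              /* the test the compiler would emit */
            goto indexerror;
        dst[i++] = *src++;
    }
    if (i >= n)                  /* the terminator needs a valid index too */
        goto indexerror;
    dst[i] = '\0';
    return true;

indexerror:
    /* Over-long input is treated as malformed and simply rejected. */
    return false;
}
```

With an 8-byte buffer, checked_copy(buf, sizeof buf, "short") succeeds, while an input of 8 or more characters is rejected rather than overwriting whatever follows the buffer.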