Re: Is a[i] = i++ correct?



On Dec 28, 8:37 am, c...@xxxxxxxx (Richard Harter) wrote:
On Thu, 27 Dec 2007 12:13:10 +0000, Richard Heathfield

<r...@xxxxxxxxxxxxxxx> wrote:
jeniffer said:

Hi

I want to know why is  a[i] = i++ ; wrong?

What do you think it should mean? Given this code:

int a[3] = { 5, 7, 9 };
int i = 0;
a[i] = i++; /* bug */

which member of a[] do you think will be updated, and to what value?

If C used a left to right order of application

Then it would be a better, safer language.

similar to that
for arithmetic (with the as-if rule as a back door) then the
results would be well defined.

But it isn't and so they are not. Your point is?


 After the statement a[0] would be
0 and i would be 1.  Similarly, the statements

i=0;
a[i++] = ++i + i++;

would evaluate as follows:

The target of the assignment is a[0].
i is incremented after computing the location to become 1.
On the RHS i is incremented to become 2. (++i)
i is added to i to produce 4; once the addition is completed i is
incremented to become 3.  (i++).
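For illustration only, the hypothetical left-to-right reading walked through above can be written as explicit, separately sequenced statements. This is a sketch of that imagined rule, not of anything standard C guarantees; the function name left_to_right_demo and its parameters are made up for the example.

```c
#include <assert.h>

/* Hypothetical helper: mimics "a[i++] = ++i + i++;" under an ASSUMED
   strict left-to-right rule. Standard C defines no such rule; the
   original one-line statement has undefined behavior. */
void left_to_right_demo(int *a, int *i)
{
    assert(a && i);

    int *target = &a[*i];   /* the target of the assignment is located first */
    *i += 1;                /* i++ on the left-hand side takes effect */

    *i += 1;                /* ++i: increment, then use the new value */
    int left = *i;

    int right = *i;         /* i++: use the current value ... */
    *i += 1;                /* ... then increment */

    *target = left + right; /* the store happens last */
}
```

Starting from i = 0, this leaves 4 in a[0] and 3 in i, matching the steps listed above.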

Of course C does not guarantee the order of evaluation except in
special cases, and it is important to understand that it does
not.  One can argue that not guaranteeing the preservation of
code order is a design flaw in the C language, but it doesn't

I would agree. Today, it's no longer a good engineering tradeoff.

What the loose order of evaluation buys you is the ability to optimize
code whose operands are accessed through indirection that can't be
analyzed at compile time.

Even if the order of evaluation is well-defined, you can still optimize
code like

a[j] = i++;

quite nicely. The compiler still can change the order of actual
evaluation to make it run fast on the given CPU, because the objects
a[], i and j are distinct, non-overlapping. And, also, they are not
volatile objects. So the order in which anything takes place is not
externally visible behavior. Only the correctness of the end result
matters. It doesn't matter whether a[j] receives the value first, or
whether i receives the value first.
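A rough sketch of that point: the two hand-written schedules below perform the store and the increment in opposite orders, yet leave the same final state, as long as i does not point into a[]. The function names are invented for the example, and passing i by pointer is only a device to make the two orders visible.

```c
#include <assert.h>

/* Schedule A: store the old value of i into a[j], then bump i.
   Assumes *i is not an element of a[]. */
void store_then_bump(int *a, int j, int *i)
{
    assert(a && i);
    a[j] = *i;
    *i = *i + 1;
}

/* Schedule B: bump i first, then store the saved old value.
   Same assumption: *i is not an element of a[]. */
void bump_then_store(int *a, int j, int *i)
{
    assert(a && i);
    int old = *i;
    *i = old + 1;
    a[j] = old;
}
```

Since neither order is observable from outside, a compiler would be free to pick whichever is faster under the as-if rule.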

However, if you have indirection, like:

a[*p] = (*q)++;

then the order matters. In C the way it is, this is undefined if p and
q point to the same memory location. But if they point to different
integers, then it's well-defined! In the general case, it is only
known at run-time whether p and q are aliased. Because of the
undefinedness of the behavior if p and q are aliased, the compiler
doesn't have to care about that case, and can generate code to do it
in any arbitrary order.

If you make the order well-defined, then the compiler has to work with
the suspicion that p and q may be the same object. That of course
affects code generation decisions. If p and q are never in fact
aliased, then that code may be less than optimal.
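As a sketch of what a defined left-to-right order would oblige the compiler to preserve, here is the same assignment spelled out with every step in its own statement. Written this way it is well-defined in today's C even when p and q point to the same int. The name assign_in_order is hypothetical, and the sketch assumes neither pointer points into a[].

```c
#include <assert.h>

/* Explicitly sequenced version of "a[*p] = (*q)++;".
   Well-defined even if p and q alias each other.
   Assumes *p is a valid index and neither pointer points into a[]. */
void assign_in_order(int *a, const int *p, int *q)
{
    assert(a && p && q);

    int index = *p;   /* left operand first: read the index */
    int old = *q;     /* then the value of the right operand */
    *q = old + 1;     /* side effect of the post-increment */
    a[index] = old;   /* finally the store */
}
```

The load of *p has to stay ahead of the store to *q unless the compiler can prove the two are distinct, which is exactly the constraint described above.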

In modern C, we now have the "restrict" qualifier, which makes the
behavior undefined when an object modified through one
restrict-qualified pointer is also accessed through another pointer.
I.e. in a C language dialect which is like C99, but in which evaluation
order is well-defined, we could still get the undefined behavior when
p and q overlap, like this:

int *restrict p, *restrict q;

/* ... point them to the same thing ... */

a[*p] = (*q)++;

The compiler can assume that p and q are not aliased and optimize the
code accordingly.
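A minimal sketch of how that might look as a function, assuming a C99 compiler. The name store_and_bump is made up, and the assert is only a debug-time check of the no-alias promise, not something restrict itself provides.

```c
#include <assert.h>

/* The caller promises that p and q do not refer to the same int.
   Breaking that promise is undefined behavior because of restrict,
   so the compiler may load *p and update *q in either order. */
void store_and_bump(int *a, int *restrict p, int *restrict q)
{
    assert(a && p && q);
    assert(p != q);   /* debug-time check of the promise */

    a[*p] = (*q)++;
}
```

Called with distinct objects, for example an index variable and a separate counter, this has the same result under either evaluation order.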

Loose evaluation order is merely an optimization crutch which was
needed before restrict qualifiers were introduced.

Speaking of optimization crutches, ultimately, what would be a good
solution would be the ability to define optimization parameters over
specific blocks of code. Suppose you had a way to express the idea
``over this block of code, please use classic loose evaluation
order''. You could have the safety benefit of well-defined order
throughout most of the program, as well as the optimization benefits
of loose order in hotspots.

So basically, the argument that loose evaluation order is a necessary
design decision for good code generation simply doesn't hold water.
It's true with regard to 1970s compiler technology, if even that.

matter - C is cast in stone.

C is not cast in stone. Past undefined behaviors can easily be defined
in the future, without breaking any correctly written code.