Re: 'size_t' and Array Limits (From: ptrdiff_t maximum)



Shao Miller <sha0.miller@xxxxxxxxx> writes:
Keith Thompson wrote:
Shao Miller <sha0.miller@xxxxxxxxx> writes:
Keith Thompson wrote:
Shao Miller <sha0.miller@xxxxxxxxx> writes:
[...]
In C89, I believe we have in A.6.3.6 that 'size_t' is "The type
of integer required to hold the maximum size of an array".
What document are you quoting? In the C90 standard, Annex A is
the bibliography; there is no A.6.3.6. C90 6.3.6 covers additive
operators; it doesn't mention size_t. I don't see the quoted
sentence in either edition of K&R, though I haven't searched
thoroughly; neither has an A.6.3.6.
I was quoting an ANSI C Standard draft (ANSI X3J11/88-090). If
that didn't make it into the ANSI C Standard X3.159-1989, then we
can remove the quotation marks and instead offer that it simply
follows from the 'sizeof' operator that 'size_t' must be the
integer type required to hold the maximum size of an array.

Ok. The C90 standard has this in G.3.7.

G.3 says:

Each implementation shall document its behavior in each
of the areas listed in this subclause. The following are
implementation-defined:

G.3.7 Arrays and pointers:

The type of integer required to hold the maximum size of an
array -- that is, the type of the sizeof operator, size_t
(6.3.3.4, 7.1.1).
Thanks for pointing out the correct section in the actual C90 Standard.

The corresponding section in C99 doesn't mention size_t. In fact,
the C99 section on implementation-defined behavior doesn't mention
size_t at all (which it probably should).
Oops! That might be a step backwards... Could it have been
intentional, I wonder?

C90 7.1.6 merely says that size_t is "the unsigned integral type
of the result of the sizeof operator", with no mention of arrays.
C99 7.17p2 has the same wording.
So perhaps we follow along to 'sizeof'. Are arrays mentioned there?

No.

[...]

size_t isn't specifically related to arrays; it's the result of
sizeof, which can be applied to any object type.
This might be perceived to be a little bit related to
bounds-checking. Here we have the notions of "array," "array type,"
and "array object." We know[1] that we can define an array object with
a declaration, but do array objects (including multi-dimensional array
objects) deserve different treatment than "objects?"

IMHO, no, they don't.

Note that, for purposes of pointer arithmetic, any object can be treated
as a single-element array.

If all objects have an object representation[2], does this include
array objects?

Of course; why wouldn't it?

(It's also not 100%
clear that an array can't be bigger than SIZE_MAX bytes.)
Heheh. I agree. Earlier I gave code which attempts to work with an
"array" (whatever that means) via pointer arithmetic beyond the limits
of 'size_t'.

I think it would be reasonable for one person to argue that a C array
object cannot have a size greater than the greatest value that can fit
within a 'size_t' and for another person to argue that 'size_t' and
array objects have no connection. :) Why? Because whether or not an
object is an array object seems to be relative to which part of the
Standard is used as qualifier.

I think you're overcomplicating this. Whether an object is an
array object simply depends on whether it's of array type. On the
other hand, as I mentioned, any object can be treated, for certain
well-defined purposes, as a single-element array object.

void *vp;
vp = calloc(SIZE_MAX, 2);

Or:

size_t sz = sizeof (char[SIZE_MAX][2]);

In the first case, any sane implementation will set vp to a null
pointer. In the second, an implementation can reject the type
char[SIZE_MAX][2] because it exceeds a capacity limit. If it
accepts it, since sizeof is an operator, I'd argue that the same
rules regarding overflow apply to it as to any other operator; if
the result would exceed SIZE_MAX, since size_t is an unsigned type,
the result is reduced modulo SIZE_MAX+1. I'm not particularly
happy with that interpretation.

Any sane implementation will simply make size_t big enough to
hold the size of any object it can support, and avoid the issue
altogether. The question (which is almost entirely theoretical
rather than practical) is whether the standard requires (or *should*
require) implementations to be sane.

malloc() already can't create objects bigger than SIZE_MAX bytes.
calloc() potentially can; the standard could be tweaked to require
it to fail if the mathematical product of its two arguments exceeds
SIZE_MAX. An attempt to declare an object or type whose size
exceeds SIZE_MAX could be made a constraint violation. VLAs are
slightly tricky; probably all you could do is make the behavior of
huge VLAs undefined.

Yikes. But it might be pleasant to work with such a large "array"
anyway. It's an example of asymmetry in the Standard, in my
opinion. We can attempt to construct an array with an arbitrarily
large size and we can attempt to increment a pointer to point to an
arbitrarily high element, but 'sizeof' yields 'size_t', which has a
finite set of values, and pointer subtraction yields 'ptrdiff_t',
which likewise has a finite set of values.

The C99 environmental limit for "bytes in an object" (hosted)[3] is
consistent with the minimum requirement for 'SIZE_MAX'[4].

Is the behaviour of these [clipped for brevity] code examples one or
more of: unspecified, implementation-defined, undefined? Or do we
additionally need "unknown"?

References from the "C99" C Standard draft with filename 'n1256.pdf':
[1] 6.5.2.1p4 "...Consider the array object defined by the declaration..."
[2] 6.2.6.1p4 "...any other object type..."
[3] 5.2.4.1p1 "...bytes in an object (in a hosted environment only)..."
[4] 7.18.3p2 "...limit of size_t..."

--
Keith Thompson (The_Other_Keith) kst-u@xxxxxxx <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"