Re: Memory layout in unions



qarnos <darrencubitt@xxxxxxxxx> writes:
I just have a quick question for people more familiar with the C
standards than myself.

If I have a union with an anonymous struct, as follows:

    union my_union
    {
        unsigned int ccount[2];

        struct
        {
            unsigned int rcount;
            unsigned int lcount;
        };
    };

Am I guaranteed that ccount[0] will map to rcount and ccount[1] to
lcount? Or is the compiler allowed to re-order the struct members?

If you use a conforming compiler, your only guarantee is that you'll
get a diagnostic message. If you don't get one, your compiler is not
conforming, or at least you didn't run it in a conforming mode. (gcc
is not conforming by default; try "-ansi -pedantic".)

Standard C, as of C90 and C99, does not allow anonymous struct members
(C11 later added them).

But let's make it non-anonymous (that wasn't your main point anyway):

    union my_union {
        unsigned int ccount[2];
        struct {
            unsigned int rcount;
            unsigned int lcount;
        } foo;
    };

union my_union obj;

The standard guarantees that members of a struct (other than bit
fields) are laid out in the order in which they're declared, that the
first member of a struct is at offset 0, and that each member of a
union is at offset 0. Thus obj.ccount[0] and obj.foo.rcount are
guaranteed to occupy the same location.
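
To make those guarantees concrete, here is a minimal sketch that compares the addresses involved. The helper names are hypothetical, not part of the original question. Both checks rely only on the offset-0 rules described above, so the assertion should never fire on a conforming implementation.

```c
#include <assert.h>

union my_union {
    unsigned int ccount[2];
    struct {
        unsigned int rcount;
        unsigned int lcount;
    } foo;
};

/* Hypothetical helper: returns 1 if the union, both of its members,
   and the first member of the struct all start at the same address. */
int first_members_coincide(void)
{
    union my_union obj;
    const void *base = &obj;

    return base == (const void *)&obj.ccount[0]
        && base == (const void *)&obj.foo
        && base == (const void *)&obj.foo.rcount;
}

/* Hypothetical helper: aborts if the guarantee is ever violated. */
void check_first_members(void)
{
    assert(first_members_coincide());
}
```

Only addresses are compared here; nothing is stored or read through the union, so the question of reading a member other than the one last written does not arise.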

Compilers are allowed to insert arbitrary padding between struct
members and/or after the last member. Normally this is done for
alignment purposes, but the standard doesn't restrict it; a perverse
compiler could insert as much padding as it likes. I don't think it's
possible for padding between rcount and lcount to be necessary for
alignment purposes, so obj.ccount[1] and obj.foo.lcount almost
certainly occupy the same location, but the standard doesn't actually
guarantee it.
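
Since the second half of the overlay is not guaranteed, code that depends on it can at least check it. The following is a sketch under that assumption; the function names are hypothetical. It measures any padding between rcount and lcount and lets a program refuse to run if the layout is not what it expects.

```c
#include <assert.h>
#include <stddef.h>  /* size_t */

union my_union {
    unsigned int ccount[2];
    struct {
        unsigned int rcount;
        unsigned int lcount;
    } foo;
};

/* Hypothetical helper: number of padding bytes the compiler placed
   between rcount and lcount (expected to be 0 in practice). */
size_t padding_before_lcount(void)
{
    union my_union obj;
    const char *base = (const char *)&obj;
    size_t lcount_offset = (size_t)((const char *)&obj.foo.lcount - base);

    /* rcount is at offset 0, so anything beyond its size is padding. */
    return lcount_offset - sizeof obj.foo.rcount;
}

/* Hypothetical helper: returns 1 if ccount[1] and foo.lcount occupy
   the same location. */
int lcount_overlays_ccount1(void)
{
    union my_union obj;

    return (const void *)&obj.ccount[1] == (const void *)&obj.foo.lcount;
}

/* Hypothetical helper: call once at startup if the program depends
   on the overlay; aborts on a compiler that pads the struct. */
void require_overlay(void)
{
    assert(lcount_overlays_ccount1());
}
```

A check like this does not make the layout portable; it only turns a silent mismatch into an immediate failure.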

Furthermore, though unions are commonly used to treat a given chunk of
memory as if it were of two different types, the standard doesn't
actually support this usage except in a few cases. Storing a value in
one member of a union and then reading a value from another member is,
in most cases, undefined behavior. It's a common enough usage that
any compiler will probably let you get away with it, but even if
obj.ccount[0] and obj.foo.rcount occupy the same location, an
optimizing compiler could theoretically rearrange the code so that it
doesn't behave that way. For example:

    int n = 42;
    printf("%d\n", n);
    /* The generated code could use a literal 42 rather than
       re-loading the value of n */

    /* declarations as above */
    obj.foo.rcount = 42;
    obj.ccount[0] = 137;
    printf("%u\n", obj.foo.rcount);  /* %u: rcount is unsigned int */
    /* The generated code could use a literal 42 rather than
       re-loading the value of obj.foo.rcount. Since the value must
       be 42 unless you've done something that invokes undefined
       behavior, this is a valid optimization. */

*But* there's a lot of code out there that does this kind of thing,
even though the standard doesn't support it, and it's unlikely that a
compiler vendor is going to break such code.

Having said all that, there is a way to do what you want that's fully
supported by the standard:

    struct my_struct {
        unsigned int ccount[2];
    };
    #define rcount ccount[0]
    #define lcount ccount[1]
    struct my_struct obj;

Now obj.rcount actually *means* obj.ccount[0], and obj.lcount means
obj.ccount[1].
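
A short usage sketch of the macro approach follows. The helpers set_counts and total_count are hypothetical names added for illustration. Because the macros expand to array elements, named access and indexed access refer to the same objects with no union involved.

```c
#include <assert.h>
#include <stddef.h>  /* size_t, NULL */

struct my_struct {
    unsigned int ccount[2];
};
#define rcount ccount[0]
#define lcount ccount[1]

/* Hypothetical helper: store both counts through the named aliases. */
void set_counts(struct my_struct *s, unsigned int r, unsigned int l)
{
    assert(s != NULL);
    s->rcount = r;  /* expands to s->ccount[0] = r */
    s->lcount = l;  /* expands to s->ccount[1] = l */
}

/* Hypothetical helper: read the same counts back by index. */
unsigned int total_count(const struct my_struct *s)
{
    unsigned int total = 0;
    size_t i;

    assert(s != NULL);
    for (i = 0; i < 2; i++)
        total += s->ccount[i];
    return total;
}
```

One caveat with this design: the preprocessor rewrites every later use of the identifiers rcount and lcount in the translation unit, including unrelated variables or members with those names, so the macros are best kept in a narrow scope or given more distinctive names.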

--
Keith Thompson (The_Other_Keith) kst-u@xxxxxxx <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"