Re: Unions and Structure Questions

From: B. v Ingen Schenau (bart_at_ingen.ddns.info)
Date: 01/18/04


Date: Sun, 18 Jan 2004 22:21:36 +0100

Barry Schwarz wrote:

> On Sat, 17 Jan 2004 09:41:03 GMT, "Anthony Borla"
> <ajborla@bigpond.com> wrote:
>
>>Whereas the members of a struct each occupy a different memory location,
>>the members of a union each occupy the *same* memory location. For
>>example, this struct:
>>
>> struct Product
>> {
>> long productCode;
>> float cost;
>> };
>>
>>occupies:
>>
>> sizeof(long) + sizeof(float) bytes
>
> Plus the size of any padding between the members and after the last
> member.

Correct.

>
>>
>>whereas this union:
>>
>> union Product
>> {
>> long productCode;
>> float cost;
>> };
>>
>>occupies:
>>
>> sizeof(long) > sizeof(float) ? sizeof(long) : sizeof(float) bytes
>
> Plus any padding.

I would not expect any padding in a union, unless two or more members have
incompatible alignment requirements, or unless you count the octets that
don't contribute to the value of the smaller sub-objects as padding.
And if you happen to have a union (or struct, for that matter) with types
that have incompatible alignment requirements, don't be surprised if the
compiler adds an incredible amount of padding.
If you have, for example:

union {
 long l;        /* sizeof(long)==4, address must be a multiple of 4 */
 long double d; /* sizeof(long double)==10, address must be a multiple of 5 */
} u;

For this union to be properly aligned for each of its members, it must be
located at an address that is a multiple of 20 (the least common multiple
of 4 and 5).
The size of u is necessarily also a multiple of 20 (most likely, sizeof
u==20), because &u and (&u+1) must both be multiples of 20. As a result,
10 padding octets are added to the 10 value octets.

<snip>
>>If, instead, we use a union to describe the variant data:
>>
>> struct DataRecord
>> {
>> char recordType;
>> union Data
>> {
>> struct RType_A ra;
>> struct RType_B rb;
>> } data;
>> };
>>
>>we need only declare:
>>
>> DataRecord record;
>>
>>and read data as before:
>>
>> ...
>> fread(&record, sizeof(DataRecord), 1, stream);
>> ...
>>
>>However, this time, no data movement need take place - the data is already
>>in memory in the exact format required for processing. All that is needed
>>is a check to see what type of data is currently stored:
>>
>> ...
>> if (record.recordType == TYPE_A)
>> ...
>> // Process 'record.data' for TYPE_A
>> ...
>> else if (record.recordType == TYPE_B)
>> ...
>> // Process 'record.data' for TYPE_B
>> ...
>> ...
>>
>>and it can immediately be processed. Simply by using union, memory has
>>been saved, and data movement avoided.
>>
>
> I hope one of the experts weighs in with an opinion on whether the
> approach you describe gets around the limitation in the new C99
> standard that prohibits accessing one member of a union when the data
> was stored in a different member. Since you did not refer to any
> particular member of the union, we could say you loaded the union
> anonymously.

I hope I am considered enough of an expert to answer this. :-)
The restriction on which members of a union you can access at any given time
is not new with C99. My copy of K&R2 also has the restriction that you may
not access a member of a union other than the one you most recently stored
a value in (with a few exceptions that have also not changed with the
arrival of C99).

To come back to your question, the union that was read from storage is
identical to the one written into the storage (assuming no bits have
toppled over in the meantime). This means that you can assume that any
restrictions in place when the union was written out are now valid again,
with the added restriction that previously valid pointers contained in the
union may have been invalidated in the meantime.
So if the union was written out containing a TYPE_A record, then it now
again contains a valid TYPE_A record.

Bart v Ingen Schenau

-- 
a.c.l.l.c-c++ FAQ: http://www.snurse-l.org/acllc-c++/faq.html (currently
unavailable)
a.c.l.l.c-c++ FAQ mirror: http://www.inglorion.com/acllcc++.html
c.l.c FAQ: http://www.eskimo.com/~scs/C-faq/top.html
c.l.c++ FAQ: http://www.parashift.com/c++-faq-lite/

