Re: Unions, storage, ABI's

Jens.Toerring_at_physik.fu-berlin.de
Date: 02/10/05


Date: 10 Feb 2005 16:27:19 GMT

Koen <no@ssppaamm.com> wrote:
> Does anyone know what the standard says about the way unions are
> stored in C? I mean the following:

> Let's say you have a union with a double and a char field:

> union MyUnion_t
> {
> double Double;
> char Char;
> };

> MyUnion_t aUnion;

You need

union MyUnion_t aUnion;

here since MyUnion_t isn't typedef'ed but just a tag...
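
To illustrate the difference, here is a minimal sketch. The typedef and the two helper functions are made up for this example and are not part of the original code. Since tags and typedef names live in separate name spaces in C, the typedef may even reuse the tag's name, after which the original declaration would compile:

```c
#include <stddef.h>

union MyUnion_t
{
    double Double;
    char Char;
};

/* Hypothetical typedef: makes the bare name 'MyUnion_t' usable as a type. */
typedef union MyUnion_t MyUnion_t;

/* Declares the object via the tag; works with or without the typedef. */
size_t size_via_tag(void)
{
    union MyUnion_t aUnion;
    return sizeof aUnion;
}

/* Declares the object via the typedef name; needs the typedef above. */
size_t size_via_typedef(void)
{
    MyUnion_t aUnion;
    return sizeof aUnion;
}
```

Both functions declare an object of the same type, so they return the same size.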

> Is it standardized somehow which byte of the allocated storage the
> Char field will use?

What you cited later, i.e.

>> ... A pointer to a union object, suitably
>> converted, points to each of its members (or if a member is a bit-
>> field, then to the unit in which it resides), and vice versa.

tells that both the 'Double' and 'Char' members will be at the very
first address of the union. I.e.

          &aUnion.Double == (double *) &aUnion
          &aUnion.Char == (char *) &aUnion

That's what is meant by that sentence, i.e. you can use &aUnion after
suitable casting either as a pointer to the 'Double' or to the 'Char'
member (whether doing so makes much sense is a different question).
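
A small sketch to check this on your own implementation. The function names are invented for this example; the comparisons go through void * so that all three pointers can be compared with each other:

```c
#include <stdio.h>

union MyUnion_t
{
    double Double;
    char Char;
};

/* Returns 1 if both members start at the address of the union itself. */
int members_share_address(union MyUnion_t *u)
{
    return (void *) &u->Double == (void *) u
        && (void *) &u->Char == (void *) u;
}

/* Prints the three addresses so they can be compared by eye. */
void print_addresses(union MyUnion_t *u)
{
    printf("union:  %p\n", (void *) u);
    printf("Double: %p\n", (void *) &u->Double);
    printf("Char:   %p\n", (void *) &u->Char);
}
```

Given 'union MyUnion_t aUnion;', the call members_share_address(&aUnion) should return 1 on any conforming implementation, since that is what the quoted sentence promises.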

> And a related question: if you dump unions in binary form to a file,
> and then reload them from the file on a different platform, or with a
> program compiled by a different compiler, are you guaranteed to get
> back what you stored? (I think not, but I'm not sure)

On a different platform this is definitely not guaranteed to
work - the floating point format can be completely different (even
sizeof(double) may differ), and the number of bits in a char (i.e.
the value of CHAR_BIT from <limits.h>) as well as the character
encoding could be different. Even with a different compiler on the
same platform you could theoretically get in trouble if the compilers
use different numbers of bits to store floating point numbers, but I
would guess that this is an extremely unlikely case.
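
To see the quantities involved, here is a minimal sketch that reports them for the implementation it is compiled with. The function names are made up for this example; the macros come from <limits.h> and <float.h>. If two platforms disagree on any of these values, a raw binary dump of the union can't be expected to survive the transfer (and agreement on all of them still isn't a guarantee):

```c
#include <float.h>
#include <limits.h>
#include <stdio.h>

/* Number of bits a double occupies in memory on this implementation. */
size_t double_storage_bits(void)
{
    return sizeof(double) * CHAR_BIT;
}

/* Prints the implementation-defined properties mentioned above. */
void print_platform_properties(void)
{
    printf("CHAR_BIT       = %d\n", CHAR_BIT);
    printf("sizeof(double) = %lu\n", (unsigned long) sizeof(double));
    printf("FLT_RADIX      = %d\n", FLT_RADIX);
    printf("DBL_MANT_DIG   = %d\n", DBL_MANT_DIG);
}
```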

> And another point: if you use the same union across different ABI's,
> will that work without problems? For example: if you get a pointer to
> such a union from a library compiled by a specific compiler, and then
> use it in another module compiled with a different compiler, will
> aUnion.Char work as expected?

As a rule I would expect it to work, since a lot of things would break
if different compilers used different chars or doubles (you probably
wouldn't be able to link at all in that case, since the libraries would
need to use a different libc). But I don't think there's a promise in
the standard that it will always work.

> Someone on the comp.std.c group found this:
>> a union object at any time. A pointer to a union object, suitably
>> converted, points to each of its members (or if a member is a bit-
>> field, then to the unit in which it resides), and vice versa.

> That doesn't seem to guarantee much does it?
> I was hoping to use (*ptr).Char or (*ptr).Double without having to cast
> anything. Looks like no more details are specified about whether this is
> supposed to work across different ABI's:

No, you don't need a cast. The sentence just says that you can use the
address of the union, after suitable conversion, as a pointer to each
of its members, which guarantees that each member lies at the start
of the union. But there's nothing wrong with using 'ptr->Char' or
'ptr->Double' without a cast - by specifying the member you already
tell the compiler which type is meant. So 'ptr->Char' is a char and
'ptr->Double' a double without any casts.
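
A short sketch of member access through a pointer, with no casts anywhere. The accessor functions are invented for this example; note that each read only fetches the member that was stored last:

```c
union MyUnion_t
{
    double Double;
    char Char;
};

/* Stores a double through the pointer; the member name selects the type. */
void set_double(union MyUnion_t *ptr, double value)
{
    ptr->Double = value;
}

/* Stores a char through the same kind of pointer, again without a cast. */
void set_char(union MyUnion_t *ptr, char value)
{
    ptr->Char = value;
}

/* The expression ptr->Double already has type double. */
double get_double(const union MyUnion_t *ptr)
{
    return ptr->Double;
}

/* The expression ptr->Char already has type char. */
char get_char(const union MyUnion_t *ptr)
{
    return ptr->Char;
}
```

'(*ptr).Char' and 'ptr->Char' mean the same thing, so either spelling works here.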

> MyUnion_t theUnion;
> somelib->getUnion(&theUnion);

> where this is in my own program, and somelib is a function table into a
> library possibly compiled by another compiler.
> Will the libary return a union with the same binary layout as my own
> program would do?

As I wrote above, it's very likely to work, but I don't see that there's
a guarantee. But if it doesn't work I would expect things to fail already
at the linking stage, since such differences would very likely lead to
trouble all over the place.
                                   Regards, Jens

-- 
  \   Jens Thoms Toerring  ___  Jens.Toerring@physik.fu-berlin.de
   \__________________________  http://www.toerring.de

