Re: Unions Redux



On Thu, 15 Mar 2007 05:55:15 GMT, Yevgen Muntyan
<muntyan.removethis@xxxxxxxx> wrote in comp.lang.c:

> Jack Klein wrote:
>> On 14 Mar 2007 15:10:44 -0700, "Old Wolf" <oldwolf@xxxxxxxxxxxxxx>
>> wrote in comp.lang.c:
>>
>>> Ok, we've had two long and haphazard threads about unions recently,
>>> and I still don't feel any closer to certainty about what is permitted
>>> and what isn't. The other thread topics were "Real Life Unions"
>>> and "union {unsigned char u[10]; ...} ".
>>
>> Most of the rambling was caused by the OP, I think, rather
>> than the material. I am not criticizing, just observing.
>
> [snip]
>
>> There is no difference between aliasing in a union and aliasing via
>> pointer casting.

> Sorry if it's something obvious or stupid, but please consider this
> (no pointers involved).

You are mistaken; of course there are pointers involved, just not
pointer objects.

> Suppose double is eight bytes big, int is four bytes, there are
> no padding bits in int.
>
> /* (1) get bits from a double and see what happens */
> double a = 3.45; unsigned int b;
> memcpy(&b, &a, sizeof b);

There are two pointers used in the function call statement. The &
operator generates two addresses, which are passed to memcpy() as
pointers to void.

> printf("%u", b);

The behavior is undefined.

> /* (2) do same thing using a union */
> union U {double a; unsigned int b;} u;
> u.a = 3.14;
> printf("%u", u.b);

The behavior is undefined.

> /* (3) initialize union with memcpy and access its member */
> double d;
> union U {double a; unsigned int b;} u;
> d = 3.14;
> memcpy(&u, &d, sizeof d);
> printf("%u", u.b);

The behavior is undefined.

> Which of the three are valid? I think (1) is; (3) maybe; (2) maybe, if
> (3) is valid and aliasing rules don't work here. If aliasing rules
> do apply to (2), then how is the first assignment in (2) different
> from memcpy() in (3)?

None of the three are valid. The standard does not give you
permission to access an lvalue of type unsigned int after writing some
or all of the bits of a double to it.

> It's in fact the same question as in the other post by OP, about
> aliasing. Perhaps all such magic is allowed and that's it. Perhaps
> I'm just stupid that I can't understand these simple things.

I am beginning to wonder if you are being deliberately obtuse. The
only "magic" that is allowed unconditionally is that any object, or
any block of memory belonging to a program, may be read as an array of
unsigned characters without invoking undefined behavior.

There is often a need, even in strictly conforming C code, to access
raw memory regardless of the type of objects it contains. A simple
operation like generating an MD5 signature for a file, for example.

Unsigned char is the one and only "raw" data type in C. It is the one
and only type allowed to read the bytes of any object, and it cannot
trap because every possible combination of bits in a byte is a valid
representation of an unsigned char value.
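
As a minimal sketch of that one unconditional permission, the following
reads an object's representation one unsigned char at a time. The helper
names byte_sum() and dump_bytes() are made up for illustration and appear
nowhere in the thread; the byte sum is only a stand-in for a real digest
such as MD5.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: add up the bytes of any object by reading its
   object representation through a pointer to unsigned char. */
unsigned long byte_sum(const void *obj, size_t size)
{
    const unsigned char *p = obj;   /* character-type access to any object */
    unsigned long sum = 0;
    size_t i;

    assert(obj != NULL);
    for (i = 0; i < size; i++)
        sum += p[i];
    return sum;
}

/* Hypothetical helper: print the bytes of an object in hex, in memory order. */
void dump_bytes(const void *obj, size_t size)
{
    const unsigned char *p = obj;
    size_t i;

    for (i = 0; i < size; i++)
        printf("%02x ", (unsigned)p[i]);
    putchar('\n');
}
```

Calling dump_bytes(&d, sizeof d) for a double d prints its bytes in memory
order; the values shown depend on the platform's floating-point format and
byte order.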

Read the first five paragraphs of 6.2.6.1 about representation of
types, and the terms "object representation", and "trap
representation".

You, and many other C programmers, seem to think that a union has some
magical properties. It does not. Any aliasing of types you do with a
union can also be done via explicit pointer conversion and
dereference, or implicit pointer conversion as memcpy() does.

If the aliasing is defined by the standard in any one of the three
situations, it is well-defined for all of them. If it is undefined,
but works in an implementation-defined manner on your particular
platform, it will work the same way no matter what method you use to
actually alias it.
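
A rough sketch of that last point, using methods (1) and (2) from the
quoted examples. The function names are invented for illustration, and the
code assumes sizeof(unsigned int) <= sizeof(double). By the argument in
this post the standard promises nothing about the value either function
returns; the only expectation is that, on a given implementation, the two
agree.

```c
#include <assert.h>
#include <string.h>

/* Method (1): copy the leading bytes of a double into an unsigned int. */
unsigned int bits_via_memcpy(double a)
{
    unsigned int b;

    assert(sizeof b <= sizeof a);   /* assumption stated in the lead-in */
    memcpy(&b, &a, sizeof b);
    return b;
}

/* Method (2): store into one union member, read the other. */
unsigned int bits_via_union(double a)
{
    union { double a; unsigned int b; } u;

    u.a = a;
    return u.b;
}
```

Comparing bits_via_memcpy(3.14) with bits_via_union(3.14) is a quick way to
see what a particular compiler does; the result says nothing about any
other platform.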

There is nothing magical about a union when it comes to type punning,
and all three of your examples cause undefined behavior because they
are omitted from the inclusive list of allowed aliasing in 6.5p7.
There is no mention of reading some or all of the object
representation of a double with an lvalue of type unsigned int.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://c-faq.com/
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~ajo/docs/FAQ-acllc.html