Re: Basic questions

From: Arthur J. O'Dwyer (ajo_at_nospam.andrew.cmu.edu)
Date: 07/02/04


Date: Fri, 2 Jul 2004 09:38:18 -0400 (EDT)


On Fri, 2 Jul 2004, Andre Heinen wrote:
>
> On Fri, 02 Jul 2004 05:23:23 GMT, Busin wrote:
> >
> >int a[100];
> >Are all elements of "a" array consecutive in memory for sure?
>
> Yes in the sense that if you have
> int* p;
> and if p points to any element of the array except the last one,
> then p+1 points to the next element.
>
> However I think the compiler is allowed to let unused bits hang
> around between the elements, for hardware optimization purposes
> (alignment). Not sure, though.

  The compiler is allowed to insert padding bits wherever it likes
within "normal" object types. However, it is *not* allowed to put
padding between array elements. That is, sizeof a == 100*sizeof(int)
in all cases, but the compiler *is* allowed to make 'sizeof(int)'
bigger than strictly necessary, if that helps it produce efficient
code.
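
  Here is a minimal sketch of a check for that claim. The function
name array_is_contiguous is my own invention for illustration; on a
conforming implementation it should always return 1.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if the array has no padding between
   elements, i.e. each element starts exactly sizeof(int) bytes after
   the previous one, and 0 otherwise. */
int array_is_contiguous(void)
{
    int a[100];
    size_t i;

    /* The whole array is exactly 100 ints, nothing more. */
    if (sizeof a != 100 * sizeof(int))
        return 0;

    for (i = 0; i + 1 < 100; i++) {
        /* Only addresses are compared; the elements are never read. */
        if ((char *)&a[i] + sizeof(int) != (char *)&a[i + 1])
            return 0;
    }
    return 1;
}
```

Any padding that exists lives inside each 'int', which is why the
byte distance between neighbours is always sizeof(int).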

> >Question 2:
> >
> >long a;
> >short b, c;
> >a = b << 16 + c;
>
> There are two bugs here. The precedence of + is higher than the
> precedence of <<. Therefore what you wrote is actually
> a = b << (16+c);
> You should use explicit ():
> a = (b<<16) + c;
>
> Also, you should cast b to a long int:
> a = ((long)b << 16) + c;
> As 16 is an int, what you wrote is equivalent to
> a = ((int)b << 16) + c.
> If, on your system, sizeof(int) is 2, b<<16 will always be zero.

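  Nit: the condition that matters is INT_MAX being less than 65536,
not sizeof(int) being 2. An 'int' might contain some of those
aforementioned padding bits (thus lowering the width of int), or
CHAR_BIT might be greater than 8 (thus raising the width of int).
And in that case b<<16 is not reliably zero: shifting by the full
width of the promoted type, or shifting a nonzero value past the
top bit, is undefined.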
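
  As a rough sketch of the difference between size and width, the
value bits of 'unsigned int' can be counted from UINT_MAX instead of
being guessed from sizeof. The helper names uint_width and
uint_storage_bits are hypothetical, made up for this example.

```c
#include <limits.h>

/* Hypothetical helper: number of value bits in unsigned int,
   counted from UINT_MAX rather than inferred from sizeof. */
int uint_width(void)
{
    unsigned int max = UINT_MAX;
    int bits = 0;

    while (max != 0) {
        bits++;
        max >>= 1;
    }
    return bits;
}

/* Hypothetical helper: bits of storage occupied by unsigned int.
   This is an upper bound on the width; the two differ exactly when
   the type has padding bits. */
int uint_storage_bits(void)
{
    return (int)(sizeof(unsigned int) * CHAR_BIT);
}
```

On most desktop systems both functions return 32, but only the first
one answers the question the shift cares about.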

  Shifting a signed number "off the left end" of the data type is
a source of undefined behavior, and ought to be avoided in any C
program. The fix is to use unsigned integer types for all bitwise
operations:

    unsigned long a;
    unsigned short b, c;
    a = ((unsigned long)b << 16) + c;

This modified code has well-defined behavior on any platform, as
far as I know.
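
  Wrapped up as a function, a sketch might look like the following.
The name combine16 is hypothetical, and I am assuming both arguments
already hold values in the range 0 to 0xFFFF.

```c
/* Hypothetical helper: put hi in the upper 16 bits and lo in the
   lower 16 bits of the result. Assumes hi and lo are each at most
   0xFFFF. The cast happens before the shift, so the shift is done
   in unsigned long, which has at least 32 bits. */
unsigned long combine16(unsigned short hi, unsigned short lo)
{
    return ((unsigned long)hi << 16) + lo;
}
```

For example, combine16(0x1234, 0x5678) yields 0x12345678.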

> Firstly, you should be aware that this code may not work at all.
> Depending on your system, shorts and longs may have the same
> size. The only guarantee C++ gives you is that
> 1 <= sizeof(short) <= sizeof(int) <= sizeof(long)
> In C, I don't know for sure.

  In both C and C++, another guarantee is that 'short' has at least
16 bits, and so does 'int', and 'long' has at least 32 bits. So
if the OP uses 'unsigned short' only to store 16-bit numbers (writing

    result = result & 0xFFFF;

where necessary), then the code will always do what's expected, as
far as I can tell.
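
  For instance, here is a sketch of 16-bit wrap-around addition that
does not care how wide 'unsigned short' really is. The name add16 is
made up for illustration.

```c
/* Hypothetical helper: add two 16-bit quantities and keep only the
   low 16 bits of the sum, even if unsigned short is wider than 16
   bits on this platform. */
unsigned short add16(unsigned short x, unsigned short y)
{
    /* Do the arithmetic in unsigned long so it cannot overflow. */
    unsigned long sum = (unsigned long)x + y;

    return (unsigned short)(sum & 0xFFFF);
}
```

With the mask in place, add16(0xFFFF, 1) is 0 everywhere, not 65536
on machines where 'unsigned short' happens to have more than 16 bits.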

> For more safety, you can add these lines in your source file:
> assert(sizeof(short)==2);
> assert(sizeof(long)==4);

  I would consider this bogus "safety." If you're really worried
about portability, you'll take the extra care to make sure your
'short' values stay within 16 bits. The above tests will give
misleading answers on systems with padding bits or uncommon CHAR_BIT
values, so they're not a panacea for portability, and they might trick
someone else into thinking the code *was* portable.

> Secondly, I doubt that a = ((long)b << 16) + c
> is faster than a = (long)b * 65536 + c
> Your compiler will probably optimize it and generate the same
> code in any case.

  True, but I think the former is clearer in this case. :)

> >Does signed or unsigned type of a and b affect the calculation?
>
> If you write x << 16, the result will be the same no matter
> whether x is signed or unsigned.

  ...unless the signed result overflows, in which case who knows?
Signed overflow in C (and C++) is undefined behavior, so the
implementation is allowed to do all kinds of things, from "clamping"
at 2**31-1, to producing a kind of "Not a Number" trap value, to
triggering an exception. [If x is nonnegative and the signed result
does not overflow, the two will be the same.]
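
  If signed values cannot be avoided, one option is to test before
shifting. This is only a sketch for the shift-by-16 case discussed
here, and the name shl16_is_defined is hypothetical.

```c
#include <limits.h>

/* Hypothetical helper: returns 1 if x << 16 is well-defined for a
   signed long, 0 otherwise. Negative values are rejected, as are
   values whose shifted result would not fit in a long. */
int shl16_is_defined(long x)
{
    /* LONG_MAX >> 16 is safe: long has at least 31 value bits. */
    return x >= 0 && x <= (LONG_MAX >> 16);
}
```

A caller would check shl16_is_defined(x) first and only then
evaluate x << 16.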

> If you write x >> 16, the result can be different.

  ...Namely, the result for negative signed numbers is
implementation-defined. For nonnegative signed numbers, the result is
the same as for the same unsigned number.
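
  If a predictable answer is needed for negative numbers too, the
shift can be applied to nonnegative values only. The sketch below
assumes that the wanted result is the quotient rounded toward
negative infinity (what an arithmetic shift gives on most machines);
that choice, and the name floor_shr16, are mine, not the OP's.

```c
/* Hypothetical helper: x divided by 65536, rounded toward negative
   infinity, without ever right-shifting a negative value. */
long floor_shr16(long x)
{
    if (x >= 0)
        return x >> 16;

    /* -(x + 1) is nonnegative and cannot overflow, even when x is
       LONG_MIN, so this shift is fully defined. */
    return -1 - ((-(x + 1)) >> 16);
}
```

So floor_shr16(-1) is -1 and floor_shr16(-65537) is -2 on every
implementation, whatever x >> 16 would have done.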

> One last thing: it may be important for you to know that "highest
> 16 bits" doesn't always mean "leftmost 16 bits" or "first address
> in memory". Some machines store the highest bytes first, and
> others store them last.

  Well, I'd say it *does* mean "leftmost 16 bits"; it's just that
some machines don't usually store the leftmost bits at the lowest
memory address. In C, the left-shift operator << *always* moves
bits to higher-order positions, and right-shift >> *always* moves
bits to lower-order positions. (IOW, a>>1 == a/2 and a<<1 == a*2
for all suitably portable values of a.)
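
  A small sketch of what that means in practice: the halves of a
32-bit value can be pulled apart with shifts and masks, and the
answer does not depend on how the machine lays out bytes in memory.
The names high16 and low16 are hypothetical.

```c
/* Hypothetical helper: the 16 bits above the low 16 bits of v. */
unsigned int high16(unsigned long v)
{
    return (unsigned int)((v >> 16) & 0xFFFF);
}

/* Hypothetical helper: the low-order 16 bits of v. */
unsigned int low16(unsigned long v)
{
    return (unsigned int)(v & 0xFFFF);
}
```

On both big-endian and little-endian machines, high16(0x12345678UL)
is 0x1234 and low16(0x12345678UL) is 0x5678, because the operators
work on values, not on memory.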

[A very complete answer, by the way; I'm just nitpicking, mostly.]

-Arthur


