Re: traumatized by pointer casting
From: Jack Klein (jackklein_at_spamcop.net)
Date: 07/10/04
- Next message: Malcolm: "Re: A problem to solve..need help"
- Previous message: Malcolm: "Re: Pass pointers or the things they point to"
- In reply to: tweak: "Re: traumatized by pointer casting"
- Next in thread: tweak: "Re: traumatized by pointer casting"
- Reply: tweak: "Re: traumatized by pointer casting"
Date: Sat, 10 Jul 2004 14:49:25 -0500
On Sat, 10 Jul 2004 00:06:30 -0700, tweak <xbwaichunasx@cox.net> wrote
in comp.lang.c:
> Jack Klein wrote:
> > On Fri, 09 Jul 2004 21:18:56 -0700, tweak <xbwaichunasx@cox.net> wrote
> > in comp.lang.c:
> >
> >
> >>Keith Thompson wrote:
> >>
> >>>Frank Cusack <fcusack@fcusack.com> writes:
> >>>
> >>>
> >>>>On 9 Jul 2004 17:45:54 -0700 j0mbolar@engineer.com (j0mbolar) wrote:
> >>>>
> >>>>
> >>>>>A pointer to char is not guaranteed to be
> >>>>>properly aligned for a pointer to struct. In fact
> >>>>>I don't see how this could even work period, even on the
> >>>>>most perverse of systems. I'm right, right? This really
> >>>>>shouldn't work?
> >>>>
> >>>>Right! How can this work?
> >>>>
> >>>>For the folks who didn't get it:
> >>>>
> >>>> char buf[100];
> >>>> struct foo {
> >>>> long a;
> >>>> long b;
> >>>> char c;
> >>>> int d;
> >>>> };
> >>>> struct foo *foo_p = (struct foo *) buf;
> >>>>
> >>>>has the problem that buf has byte-alignment, whereas struct foo has
> >>>>long alignment.
> >>>>
> >>>>foo_p->a is not guaranteed to be referenceable. On some arches it would
> >>>>cause an unaligned read (slow), on some it just wouldn't work at all.
> >>>
> >>>
> >>>It's certainly true that foo_p isn't guaranteed to be aligned to any
> >>>boundary bigger than a byte. On the other hand, a declared array
> >>>object is likely to be word-aligned. (If you count on this, of
> >>>course, your program will fail at the most inconvenient possible
> >>>moment.)
> >>>
> >>
> >>Now, you have confused me.
> >
> >
> > You are misinterpreting what Keith wrote.
> >
> >
> >>Why would an array be word-aligned (16 bits)? And a structure be
> >>byte-aligned (8 bits)?
> >
> >
> > The array might not be word-aligned. It might have an odd address.
> > When you cast the array by name to a pointer to struct, it would
> > retain that odd address.
>
> it == address of first element in the array?
Yes.
> Let me try to word how I am understanding what you are saying. The
> addresses within the struct mentioned above occupy memory addresses
> determined by their type. For example
>
> a b c d
> [4 bytes] [4 bytes] [1 byte] [2 bytes]
Actually there is a good chance that internal padding would be added
between the members of the structure, which C allows for just this
reason. To keep the discussion simple, assume a platform with 8-bit
bytes, 2-byte (16-bit) ints, and 4-byte (32-bit) longs. If the processor
has alignment requirements, that 2-byte int member d will need to be on an
even address. So assuming you define one of these structures and its
address is 0x1000 hex, the compiler will probably allocate it like
this:
    0x1000 - 0x1003    member a    4 bytes
    0x1004 - 0x1007    member b    4 bytes
    0x1008 - 0x1008    member c    1 byte
    0x1009 - 0x1009    padding     1 byte
    0x100A - 0x100B    member d    2 bytes
The padding byte is required because d must start at an even address.
C allows the compiler to insert padding bytes between the members of a
structure, and after the last member, just not before the first
member. Padding is sometimes necessary at the end to make the size of
the entire structure a multiple of the alignment size.
Consider a processor that requires 4-byte, 32-bit longs be located on
an address divisible by 4. Now consider this structure definition and
how the compiler will pad it:
struct silly {
    char c1;    /* relative address 0x0000 */
                /* padding 3 bytes 0x0001 - 0x0003 */
    long l1;    /* relative address 0x0004 - 0x0007 */
    char c2;    /* relative address 0x0008 */
                /* padding 3 bytes 0x0009 - 0x000b */
};
> In theory, the address pointed to would be the first item in the
> structure, right? Thus, the first item in the structure will always
> have an address divisible by 4. And the first item in a structure
> determines the alignment of the structure?
No, the member of the structure with the strictest alignment
requirement determines the alignment requirement of the entire
structure, as I tried to show in struct silly above. The structure
must have an alignment at least as strict as its strictest member, and
must have padding at the end, if necessary, to make the size of the
entire structure a multiple of this alignment size.
> So to guarantee that the typecast will work, the buf (array) size has to
> have the same divisibility as the first item in the structure?
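Yes, with one correction: it is the address of buf, not its size, that
must be a multiple of the structure's alignment requirement. That is
why the memory allocation functions malloc(), calloc(), and realloc()
are guaranteed to return pointers to memory blocks suitably aligned for
any data type.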
>
> Now, the array buf contains 100 bytes each of type char, so the location
> in memory of the first element in the array can have an odd address
> (address divisible by 1).
>
> So when buf is typecasted to struct foo *, the first item pointed to
> long a may not be aligned correctly since &buf[0] can have an odd
> address? Am I following what you are saying okay?
Yes.
> >>I reviewed the draft of C99. And I'm not sure what determines
> >>the boundaries.
> >
> >
> > The implementation determines the boundaries. They must conform to
> > what is physically possible with the underlying processor hardware,
> > but they might be more strict than the hardware requires. There are
> > quite a few modern RISC and DSP processors that have a general
> > requirement that types larger than one byte be aligned on addresses
> > that are a multiple of their size. They don't waste the extra
> > transistors on automatic circuitry to make slower access to unaligned
> > memory, they use them for more useful things and generate a fault or
> > just plain read or write the wrong memory instead.
>
> So char can have odd addresses, int can have addresses divisible by 2,
> long can have addresses divisible by 4 and so on?
>
> I will write a program tomorrow with intentional mis-alignment, so that
> I can debug it and see the mis-alignment. I hope I can do this with
> gdb.
>
> Brian
>
> P.S. The FAQ is brilliant. I wish I knew C as well as you.
What happens when you violate alignment requirements falls into the
category that the C standard calls undefined behavior. It is like
dividing by 0 in mathematics: you have broken the rules and there is
no correct answer.
In the actual hardware world there are three likely things that will
happen, depending on the type of the underlying processor hardware:
1. Example Intel Pentium. The processor will perform multiple memory
accesses, if necessary, and shift the bytes around between memory and
the processor core. The performance penalty can be quite severe.
2. Example ARM. The processor hardware generates a hardware
exception that causes the operating system (if there is one) to
terminate the misbehaving program.
3. Example Intel 8096. The misalignment is ignored, with incorrect
results. If you have a pointer of 0x21 and try to read or write a
16-bit int at address 0x21 and 0x22, the processor actually reads or
writes at the aligned address, bytes 0x20 and 0x21.
--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++ http://www.contrib.andrew.cmu.edu/~ajo/docs/FAQ-acllc.html