Re: How are C++ objects laid out in memory ?



WahJava wrote:
> Hi hackers,
>
> I'm investigating on how C++ objects can be accessed and invoked by
> the external code (e.g. a C code, or a assembly language routine, or
> some other language routines). I'm using "Microsoft 32-bit C/C++
> Optimizing Compiler v. 13.10.3052". How C++ class is actually
> laid out in memory ?

This is a very complicated question, since the layout is affected by all of
the following language features (and probably others that I have forgotten
to list):
* Virtual functions
* Inheritance of virtual functions
* Multiple inheritance
* RTTI

> My half correct guess is representation as a structure is
> represented. e.g.
>
> class Msg
> {
> char* msg;
> public:
> Msg(const char*);
> void print();
> ~Msg();
> };
>
> might be represented in C as:
>
> struct MsgStruct
> {
> char* msg;
>
> void (*construct)(struct MsgStruct*, const char*);
> void (*print)(struct MsgStruct*);
> void (*destruct)(struct MsgStruct*);
> };

No. Functions that are not declared "virtual" are statically bound. Look at
this example:

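#include <stdio.h>

class A
{
public:
    /* Not virtual: calls are bound at compile time. */
    void foo()
    {
        printf("I'm in A\n");
    }
};

class B : public A
{
public:
    /* Hides A::foo rather than overriding it. */
    void foo()
    {
        printf("I'm in B\n");
    }
};

void strange(A *a)
{
    a->foo();
}

int main(void)
{
    B b;
    strange(&b);
}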

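Surprised? The program prints "I'm in A", because a non-virtual call is
bound according to the static type of the pointer (A *), not the type of the
object it points to. It's one of the oddities of C++. Add "virtual" to both
declarations of "foo" (strictly, only the one in A is required) and see how
the result changes.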
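
For reference, here is a minimal sketch of the virtual variant. To make the result easy to check, foo() returns its message instead of printing it, and demo_virtual_dispatch() is a hypothetical helper that does not appear in the example above.

```cpp
#include <cassert>
#include <cstring>

class A
{
public:
    virtual ~A() {}
    // Returns its message instead of printing it so the result can be checked.
    virtual const char *foo() { return "I'm in A"; }
};

class B : public A
{
public:
    // Overrides A::foo; "virtual" is implied here but repeated for clarity.
    virtual const char *foo() { return "I'm in B"; }
};

// Sees only an A*, yet the call is dispatched on the object's dynamic type.
const char *strange(A *a)
{
    return a->foo();
}

// Hypothetical helper: asserts which version each call reaches.
void demo_virtual_dispatch()
{
    A a;
    B b;
    assert(std::strcmp(strange(&a), "I'm in A") == 0);
    assert(std::strcmp(strange(&b), "I'm in B") == 0);
}
```

With the virtual keyword in place, strange() reaches B::foo even though it only ever sees an A *.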

Also, assuming you have this class:

class A
{
    int a;
    virtual void foo();
    int b;
};

You *should* under MSVC get this structure:

struct A_VTABLE
{
    void (__thiscall *foo)();   /* "this" is passed in ECX */
};

struct A_STRUCT
{
    A_VTABLE *vtable;   /* hidden pointer, placed before the data members */
    int a;
    int b;
};

This is what I have observed, at any rate. I don't know if the structure
layout is optimized for certain cases. In the above example, with only one
virtual function, it would arguably make more sense to store the pointer to
foo directly in A_STRUCT. With RTTI enabled, a pointer to the type
information is placed immediately before the function pointers, so the
table looks like this:

struct A_VTABLE
{
    void (__thiscall *foo)();
};

struct A_VTABLE_RTTI
{
    void *rtti_data;
    A_VTABLE vtable;
};

The pointer stored in A_STRUCT remains a pointer to A_VTABLE (that is, to
the vtable member), not to the start of A_VTABLE_RTTI.
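
One way to check the hidden pointer without relying on any particular layout is to compare sizes. The sketch below is only that: the struct names and check_vtable_pointer_cost() are made up for illustration, and the assertions describe what mainstream compilers do, not anything the language guarantees.

```cpp
#include <cassert>
#include <cstddef>

// Two ints and nothing else.
struct Plain
{
    int a;
    int b;
};

// Same data plus one virtual function.
struct OneVirtual
{
    int a;
    virtual void foo() {}
    int b;
};

// Same data plus three virtual functions.
struct ThreeVirtuals
{
    int a;
    virtual void foo() {}
    virtual void bar() {}
    virtual void baz() {}
    int b;
};

// Hypothetical helper; holds on the usual compilers, not by standard decree.
void check_vtable_pointer_cost()
{
    // The first virtual function costs each object one hidden pointer.
    assert(sizeof(OneVirtual) >= sizeof(Plain) + sizeof(void *));
    // Further virtual functions enlarge the table, not the object.
    assert(sizeof(ThreeVirtuals) == sizeof(OneVirtual));
}
```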

> But the function pointers declared in above MsgStruct structure have
> to be invoked using "thiscall" calling convention (documented in MSDN,
> where "this" pointer is passed in ECX register), and "thiscall"
> convention can't be explicitly. So a tweak will be needed as below:
[...]

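This is a bit pedantic. __thiscall is almost the same as __fastcall: both
pass the first argument in ECX, but __thiscall reserves ECX for "this" and
passes the remaining arguments on the stack, whereas __fastcall also uses
EDX for the second argument.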

> But some of my thoughts contradicts what I've actually derived
> above. That's why I've not used If we've to represent C++ member
> methods as the function
> pointers in C structure, then this means we've to duplicate function
> pointers for each object which also leads to memory wastage. And this
> means, size of C++ object is increased. But size of C++ object remains
> 4 bytes, whereas size of structure instance is 16 bytes (4 bytes data,
> 12 bytes for 3 function pointers).

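No. See my comments above. Non-virtual member functions take up no space in
the object at all; they are ordinary functions that receive "this" as a
hidden argument. For virtual functions there is one copy of the vtable per
type, statically initialized, and each object stores only a single pointer
to it. Compared with storing the function pointers in every object, this
makes object creation cheaper (one pointer to set), but every
dynamically-bound (virtual) call pays for an extra indirection.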
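
The 4-bytes-versus-16-bytes observation can be checked directly. This is a rough sketch of the Msg class from the question, with function bodies that I have filled in myself; check_msg_size() is a hypothetical helper, and the size equality is what common compilers produce, not a guarantee.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

class Msg
{
    char *msg;
public:
    explicit Msg(const char *text) : msg(new char[std::strlen(text) + 1])
    {
        std::strcpy(msg, text);
    }
    ~Msg() { delete[] msg; }
    void print() const { std::printf("%s\n", msg); }
private:
    Msg(const Msg &);               // copying disabled to keep the sketch short
    Msg &operator=(const Msg &);
};

// Hypothetical helper: the object holds the data member and nothing else.
void check_msg_size()
{
    // Constructor, destructor and print() are plain functions elsewhere in
    // the program; none of them adds a byte to the object.
    assert(sizeof(Msg) == sizeof(char *));
}
```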

> Suppose I want to expose a C++ object to some C code, although that C
> code can cast my C++ object to a pointer and can change its data, but
> what about member methods. And is there any standard that controls
> this behavior ? Or every compiler does in its own way ? Then how
> member methods can be invoked ? Is there any table of function
> pointers which I can locate and then invoke the function pointers ?
[...]

This is a question for the C++ newsgroup. I don't know what the standard
does or does not require, but I would *presume* that the limitations on
class layout are lax enough that you can't rely on a specific
representation.

There are 2 portable ways I know of to expose C++ objects to C code:
1) Write your code in C (that is, without classes, not *necessarily* in C
proper) and create C++ wrapper classes for it
2) Write your code in C++ and create C wrapper functions
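
Option 2 might look roughly like the sketch below. Everything in it (Msg, MsgHandle, msg_create, msg_text, msg_destroy, demo_c_wrapper) is a made-up name for illustration. In a real project the three extern "C" declarations would go in a header shared with the C code, which would see MsgHandle only as an incomplete type.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// The C++ implementation; C code never sees this definition.
class Msg
{
    std::string text_;
public:
    explicit Msg(const char *text) : text_(text) {}
    const char *text() const { return text_.c_str(); }
};

// C code would only see "typedef struct MsgHandle MsgHandle;" (opaque).
struct MsgHandle
{
    Msg impl;
    explicit MsgHandle(const char *text) : impl(text) {}
};

extern "C" {

// Returns a null pointer on failure; exceptions must not escape into C.
MsgHandle *msg_create(const char *text)
{
    if (!text)
        return nullptr;
    try {
        return new MsgHandle(text);
    } catch (...) {
        return nullptr;
    }
}

// The returned string stays valid until msg_destroy() is called.
const char *msg_text(const MsgHandle *h)
{
    return h->impl.text();
}

void msg_destroy(MsgHandle *h)
{
    delete h;
}

} // extern "C"

// Hypothetical helper standing in for a C caller.
void demo_c_wrapper()
{
    MsgHandle *h = msg_create("hello");
    assert(h != nullptr);
    assert(std::strcmp(msg_text(h), "hello") == 0);
    msg_destroy(h);
}
```

The C side never needs to know how the object is laid out or which calling convention member functions use; it only passes the handle back to functions with C linkage.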

As to C programs being able to cast your object pointer and access data
directly, that is true, but C++ programs can do it just as easily. Also,
this sort of thinking is a bit futile since nothing would stop me from doing
this:
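#include <stddef.h>
#include <stdlib.h>

/* Deliberately undefined behavior: writes to a random address. */
void russian_roulette(void)
{
    char *p = (char *)(size_t) rand();
    (*p)++;
}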

At some point you have to accept that, if the user wants to do something
stupid, you have no way to stop them. It is good to insulate the user as
best you can, but going to the extreme is pointless.

-Matt
