Re: C Stack Corruption?

From: Gordon Burditt (gordonb.duj8m_at_burditt.org)
Date: 03/03/05


Date: 02 Mar 2005 23:46:50 GMT


>I'm working with some IBM sponsored C APIs to interface with a
>corporate legacy system, and it seems that I'm getting some stack
>corruption after any API call.

There is no guarantee that you can mix the outputs of two different
compilers together, particularly if they use different linkage
conventions for passing and retrieving arguments. In the worst
case, one compiler won't even recognize the other's object code as
object code. You don't seem to have that problem, but details like
how arguments get pushed on the stack (or put in registers), where to
look for return values of various types, and which registers are saved
or trashed across function calls may be an issue.

One possibility to consider is a "glue" routine, written in assembly
language. It accepts a call from one compiler's module, rearranges
the arguments the way the other module wants them, calls the other
module, then accepts back a return value and passes it back to the
first compiler's module. You might need a lot of "glue" routines here:
possibly one per function on either side.
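
If both compilers understand calling-convention annotations, a glue
routine can sometimes be written in C rather than assembly. A minimal
sketch, assuming (purely for illustration) that the vendor functions
use stdcall while your own code uses the compiler default. The names
VENDOR_CALL, vendor_get_total and glue_get_total are made up, and
vendor_get_total is defined here only as a stand-in so the example
compiles by itself:

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: the vendor library expects stdcall (callee pops the
   arguments). The annotation only matters on 32-bit x86; elsewhere it
   is left empty so the sketch still compiles. */
#if defined(_MSC_VER) && defined(_M_IX86)
#  define VENDOR_CALL __stdcall
#elif defined(__GNUC__) && defined(__i386__)
#  define VENDOR_CALL __attribute__((stdcall))
#else
#  define VENDOR_CALL
#endif

/* Stand-in for a function that would really live in the vendor DLL.
   Sums count ints into *total and returns 0 on success. */
int VENDOR_CALL vendor_get_total(const int *items, int count, int *total)
{
    int i;
    *total = 0;
    for (i = 0; i < count; i++)
        *total += items[i];
    return 0;
}

/* Glue: callable with the default convention. Because the declaration
   above carries the annotation, the compiler generates the argument
   passing and stack cleanup that the vendor side expects. */
int glue_get_total(const int *items, int count, int *total)
{
    assert(items != NULL && total != NULL);
    return vendor_get_total(items, count, total);
}
```

The rest of the program would call glue_get_total() and never touch
the vendor function directly. This only helps when the mismatch is one
the compiler can express; it won't cover differences in struct layout
or register usage that neither compiler documents.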

>I believe the APIs were designed to be used with IBM VisualAge, but I'm
>compiling my program with both gcc/cygwin and Microsoft's Visual C++
>compiler. While running I actually get different behavior with each
>compiler. To link to the library with cl.exe I generated a .lib file
>from the .dll file using a small open source application (script) which
>I can't seem to find online anymore (google reindex?).
>
>I can get the program to continue to run for a while by adding a char
>x[10000]; buffer as the last local variable in my function, but
>eventually I get a core dump. Stepping through the application in gdb
>I see my char buffer x go from '\0' (repeats 9999 times) to garbage
>values after any IBM API call. Commenting out x yields an immediate
>crash at best, or overwrites other necessary local variables which
>wreaks bloody havoc on subsequent calls.

One screwup I've seen with stack imbalance on gcc happens when the
function thinks it's returning a double but the caller thinks it's
returning int. Blam! Floating point unit stack overflow. The real
problem here is that the code is broken (and it could be fixed by putting
a declaration in a header file and using it where needed). I think I've
seen a case where the C stack got trashed, but I don't remember the
situation. Again, bad code, most likely caller and callee not agreeing
on the types of arguments or return values.

Warnings like those produced by gcc -Wall are your friend. Try
fixing these first, especially those that say "implicit declaration
of function ______".
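
A rough sketch of the header fix mentioned above. The names scale,
percent_of and scale.h are hypothetical, and the three "files" are
shown together so the example compiles as one unit:

```c
#include <assert.h>

/* --- scale.h: the one declaration both sides include --- */
double scale(double x, double factor);

/* --- scale.c: the callee, which really does return double --- */
double scale(double x, double factor)
{
    return x * factor;
}

/* --- caller.c: includes scale.h, so the compiler knows the return
   type is double and fetches the result from the right place --- */
int percent_of(int value, int percent)
{
    assert(percent >= 0 && percent <= 100);
    return (int)scale((double)value, percent / 100.0);
}
```

If caller.c left out the declaration, an older C compiler would assume
scale() returns int, which is exactly the mismatch described above.
gcc -Wall reports that as an implicit declaration, and adding
-Werror=implicit-function-declaration makes it a hard error.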

>I've checked and double checked the parameters going into and coming
>out of the API calls, and they appear to be correct. I've also
>initialized all local variables.
>
>I highly doubt that the IBM API is at fault, and I'm running out of
>ideas to test my code.

If you're using two different calling conventions on the calling and
called side, that's probably a mistake, whether or not either API is
"wrong", whatever that means.

>Is it possible that it's in how I compile my
>code and link to the DLL? This is running as JNI code for a larger
>Java application, so I'm using ANT to build (the behavior still
>exhibits itself when all JNI code is removed and the program is run as
>a c console application). Below are my build targets.

If possible, try compiling with the same compiler as the proprietary
dll.

>Does anyone have any further suggestions for debugging a gcc compiled
>application linked to a proprietary dll in Windows? Is there any way
>to get an easy printout of the stack at any given point in time?
>
>Unfortunately, I do not own my code, so code samples can't be posted.
>Just imagine horribly named API methods that take pointers to pointers
>to deeply nested structs of pointers. =)
>
>All help is greatly appreciated. My coworkers and I are starting to
>bang our heads against the wall.

You're probably not going to get much useful help if you can't show
us the code. I can only speculate in generalities, which roughly
comes down to "undefined behavior -> shit happens".

                                                Gordon L. Burditt


