Re: malloc under linux



In article <lnd4vvadgo.fsf@xxxxxxxxxxxxxxx>,
Keith Thompson <kst-u@xxxxxxx> wrote:

> It's been said in this thread (sorry, I don't recall by whom) that
> studies had shown that many Unix programs allocate large amounts of
> memory that they never use.
>
> But I'm rather skeptical of the claim. It seems to me that a
> well-written C program will malloc() only the memory that it actually
> needs.
>
> It's true that fork() can result in large chunks of virtual memory
> space that are never used, since the child process typically replaces
> itself with another executable via one of the exec*() functions. But
> that's different from malloc().
>
> Can anyone cite an actual study that says that Unix programs (or C
> programs under any OS) often allocate much more memory than they
> actually use?

Sorry that my posting was not clearer. I was referring to the
fork() behaviour; fork() is the -only- way to start a process in
POSIX.1 (vfork() is not standard POSIX). Every process starts off
as a clone of its parent, with pretty much just the process ID
differentiating the two. If malloc() stores its overhead within
the process memory, then all the information about what was malloc()'d
gets duplicated virtually (because all the memory pages must read out
the same); if malloc() stores its overhead outside the process memory,
then all the information about what was malloc()'d must be duplicated
by the OS (e.g., free() from the child must have the same result
as free() from the parent); either way, the malloc()'d memory must
get duplicated. At some meta level we can differentiate
between malloc() and fork() behaviour, but the effect at the process
level has to be the same as if all the malloc()'s had happened, so
the meta difference becomes moot for the purpose of this discussion.
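
A minimal sketch of that duplication, assuming a POSIX system. The
function name parent_block_survives_child() and the 32-byte buffer are
made up for illustration. The child sees the malloc()'d block with the
parent's contents, and nothing the child does to its copy (writing to
it, free()ing it) is visible in the parent:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the child saw the parent's data and
   the parent's block was unaffected by the child, 0 if not, -1 on error. */
int parent_block_survives_child(const char *msg)
{
    assert(msg != NULL && strlen(msg) < 32);

    char *block = malloc(32);
    if (block == NULL)
        return -1;
    strcpy(block, msg);

    pid_t pid = fork();
    if (pid < 0) {
        free(block);
        return -1;
    }
    if (pid == 0) {
        /* Child: same address, same contents, same allocator state. */
        int same = (strcmp(block, msg) == 0);
        memset(block, '#', 31);   /* changes only the child's copy */
        free(block);              /* updates only the child's heap */
        _exit(same ? 0 : 1);
    }

    /* Parent: wait for the child, then check our own copy. */
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        free(block);
        return -1;
    }
    int ok = WIFEXITED(status) && WEXITSTATUS(status) == 0
             && strcmp(block, msg) == 0;
    free(block);
    return ok;
}
```

On a POSIX system I would expect
parent_block_survives_child("parent data") to return 1.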

Once the new process memory is allocated as a clone of its parent,
then the new process generally, as you noted, exec()'s off a different
binary, with the implicit freeing of the malloc()'d memory -- but
the exec()'d binary retains the same process container (and some
attributes of the old executable are inherited, including some I/O
linkages). In between, the process essentially "gives back" all that
malloc()'d memory and starts afresh with a new malloc() arena and
likely with completely different memory page attributes (about which
pages are readable, writable, executable, etc.).
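
As a rough illustration of the fork()-then-exec() sequence, assuming a
POSIX system with a shell at /bin/sh. The function name
run_after_fork() and the 16 MB scratch size are invented for the
example. The child inherits the parent's large malloc()'d block but
never touches it; the exec simply discards it along with the rest of
the old address space:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs shell_cmd in a child via fork() + execl()
   and returns the command's exit status, or -1 on error. */
int run_after_fork(const char *shell_cmd)
{
    assert(shell_cmd != NULL);

    /* Heap memory that the child will inherit but never use. */
    size_t big = 16u * 1024u * 1024u;
    char *scratch = malloc(big);
    if (scratch == NULL)
        return -1;
    memset(scratch, 'x', big);

    pid_t pid = fork();
    if (pid < 0) {
        free(scratch);
        return -1;
    }
    if (pid == 0) {
        /* Child: no free() is needed. A successful exec replaces the
           whole process image, inherited heap included, while the
           process ID stays the same. */
        execl("/bin/sh", "sh", "-c", shell_cmd, (char *)NULL);
        _exit(127);   /* reached only if the exec failed */
    }

    int status;
    pid_t w = waitpid(pid, &status, 0);
    free(scratch);    /* the parent's copy was never affected */
    if (w < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

For instance, run_after_fork("exit 7") should return 7.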

Once started afresh, processes are often fairly well behaved on memory
allocation... until, that is, they need to system() or popen() or the
like, at which point the OS becomes a virtual wastrel because of the OS
assumption that copies of what has gone before will be needed in both
the old and new processes.

Even so, it is (or was) not uncommon for processes to allocate
large blocks of memory, enough to hold the maximum problem size
they are configured to handle. realloc() does exist, yes, but has
the problem that the memory might move, with all the attendant
hassles of updating all the pointers into the memory block.
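
One common way around the moving-block problem, sketched here under my
own naming (struct int_vec, vec_push() and vec_get() are hypothetical),
is to hand out indices into the block rather than pointers. An index
stays valid no matter where realloc() puts the data:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical growable array of int; start with all fields zero. */
struct int_vec {
    int *data;
    size_t len;
    size_t cap;
};

/* Appends value and returns its index, or (size_t)-1 if out of memory.
   For brevity the sketch does not guard the size multiplication
   against overflow. */
size_t vec_push(struct int_vec *v, int value)
{
    assert(v != NULL);
    if (v->len == v->cap) {
        size_t new_cap = v->cap ? v->cap * 2 : 4;
        int *p = realloc(v->data, new_cap * sizeof *p);
        if (p == NULL)
            return (size_t)-1;   /* the old block is still valid */
        v->data = p;             /* may differ from the old address */
        v->cap = new_cap;
    }
    v->data[v->len] = value;
    return v->len++;
}

/* Looks an element up by index; the index survives any reallocation. */
int vec_get(const struct int_vec *v, size_t idx)
{
    assert(v != NULL && idx < v->len);
    return v->data[idx];
}
```

The price is an extra indirection on every access, and any raw pointer
into v->data taken before a vec_push() must still be treated as stale
afterwards.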

Sometimes programmers start out by allocating as much memory as
they might ever need, preferring to immediately catch the
low-memory condition and just not go very far, rather than
finding out 15 levels down 12 hours in that there isn't enough
memory to go further. The program might try to "checkpoint" then,
but checkpointing usually requires a block of working memory, so
it is usually better to checkpoint while you know you still have
enough memory, and then just die gracefully if memory runs out.
And then there's the hybrid strategy of allocating a large checkpoint
buffer at the beginning and leaving it untouched against the
eventuality of running out of memory and needing to checkpoint.
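
The hybrid strategy might look roughly like this. All names
(reserve_init(), reserve_take(), checkpoint_progress()) and the 64 KB
size are made up for the sketch, and the "checkpoint" just formats a
step counter where a real program would write its state to stable
storage. The memset() at startup is my own addition: on a system that
overcommits, it forces the pages to be backed before they are needed.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define RESERVE_SIZE (64u * 1024u)   /* arbitrary size for the sketch */

static char *reserve;   /* emergency buffer; NULL until reserve_init() */

/* Call once at startup. Returns 0 on success, -1 if the reservation
   itself fails (in which case the program should not start the run). */
int reserve_init(void)
{
    assert(reserve == NULL);          /* guard against double init */
    reserve = malloc(RESERVE_SIZE);
    if (reserve == NULL)
        return -1;
    memset(reserve, 0, RESERVE_SIZE); /* make sure the pages really exist */
    return 0;
}

/* Hands the reserve over to the caller; only the first call gets it. */
char *reserve_take(void)
{
    char *p = reserve;
    reserve = NULL;
    return p;
}

/* Hypothetical checkpoint, to be called when an ordinary malloc()
   fails. Uses only the reserved buffer as working memory. Returns the
   number of bytes formatted, or -1 if the reserve is already gone. */
int checkpoint_progress(long step)
{
    char *buf = reserve_take();
    if (buf == NULL)
        return -1;
    int n = snprintf(buf, RESERVE_SIZE, "step=%ld\n", step);
    /* A real program would write buf out here, then exit gracefully. */
    free(buf);
    return n;
}
```

The main loop would call reserve_init() before doing anything else and
call checkpoint_progress() from its out-of-memory path.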

--
"Any sufficiently advanced bug is indistinguishable from a feature."
-- Rich Kulawiec