Re: Memory management strategy



Paul Mesken <usurper@xxxxxxxxxx> writes:
> On 30 May 2005 16:25:43 GMT, roberson@xxxxxxxxxxxxxxxxxx (Walter
> Roberson) wrote:
[...]
>>Inlining of short routines can -sometimes- take less memory than
>>calling the routine, as the registers can sometimes be used directly
>>"as-is", without having to save important values and generate the
>>call and have the return stack.
>
> The use of registers instead of the stack doesn't need inlining. There
> are often calling conventions that can control this behaviour (of
> course, such things depend on the capabilities of the compiler and
> linker).

A function call, depending on the calling conventions, might require
the use of some registers; inlining the call might make those
registers available for other purposes (like caching local variables).
This is, of course, extremely system-specific.

> "Inline functions" are more liable to use more memory since the whole
> function is put in place of the code that calls such a function. If
> some inline function is called at 20 different places in the program
> then the code of that function is placed 20 times in the code. If the
> function is big then this might give a quite substantial code size
> penalty. The fact that the function takes a little bit less code
> because it doesn't need to do stack manipulations (or less stack
> manipulations) doesn't amount to a lot when the code of that function
> is copied 20 times in the program.

On the other hand, inlining a function can provide more opportunities
for optimization. For example:

void foo(int x)
{
    if (x < 0) {
        /* some stuff */
    }
    else if (x == 0) {
        /* some other stuff */
    }
    else {
        /* yet other stuff */
    }
}

If a call to foo() is inlined in a context where the compiler knows
that x is positive, the "some stuff" and "some other stuff", along
with the tests, can be eliminated. Add to that the savings from
eliminating the entry and exit code, and you just might get less code
with 20 inlined copies than with 20 calls to a single function. Or
you might not.
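
To make that concrete, here is a rough sketch of such a call site.
classify() and sum_markers() are names I made up for illustration;
classify() is just foo() with each branch reduced to a return value so
that the result can be checked. It assumes a C99 compiler (for
"static inline"), and whether the dead branches are actually removed
depends entirely on your compiler and its optimization settings.

```c
/* Hypothetical stand-in for foo(): each branch just returns a marker. */
static inline int classify(int x)
{
    if (x < 0) {
        return -1;      /* "some stuff" */
    }
    else if (x == 0) {
        return 0;       /* "some other stuff" */
    }
    else {
        return 1;       /* "yet other stuff" */
    }
}

/* i runs from 1 to 20, so every call sees a positive argument.
   If the call is inlined, an optimizer is allowed (not required)
   to drop the first two branches and both tests. */
int sum_markers(void)
{
    int total = 0;
    int i;

    for (i = 1; i <= 20; i++) {
        total += classify(i);
    }
    return total;
}
```

The only way to know what you actually got is to compare the generated
code (or the object file sizes) with and without inlining.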

[...]

> Yes, it's only sensible to use packed values when the extra amount of
> code necessary to manipulate such values is less than the amount of
> memory that was saved by having such packed values.

For single variables, it rarely makes sense to use a smaller type.
For arrays, you can often save space by using an array of char or
short rather than int or long, or an array of float rather than double
-- but of course you need to make sure that the values will actually
fit in the smaller type.
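
A minimal sketch of the array case, with the range check made explicit.
The names (SAMPLE_COUNT, store_sample, and so on) are hypothetical, and
the actual saving depends on sizeof(int) on your system.

```c
#include <limits.h>
#include <stddef.h>

#define SAMPLE_COUNT 1000       /* hypothetical array length */

/* One unsigned char per element instead of one int. */
static unsigned char samples[SAMPLE_COUNT];

/* Store value at index i. Returns 1 on success, 0 if the index is out
   of bounds or the value does not fit in an unsigned char. */
int store_sample(size_t i, int value)
{
    if (i >= SAMPLE_COUNT || value < 0 || value > UCHAR_MAX) {
        return 0;
    }
    samples[i] = (unsigned char)value;
    return 1;
}

/* Read back the value at index i; the caller must pass a valid index. */
int load_sample(size_t i)
{
    return samples[i];
}

/* Bytes saved compared with an int array of the same length. */
size_t bytes_saved(void)
{
    return SAMPLE_COUNT * (sizeof(int) - sizeof(unsigned char));
}
```

Note that the range check itself costs code, which is exactly the
trade-off discussed above.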

One more possible tip: In embedded systems, my understanding is that
code and constant data are typically in ROM, while variable data is
typically in RAM, and that ROM is often much more plentiful than RAM.
On such a system it might make sense to write a large amount of code,
or declare a large constant array, to save a little bit of run-time
data.
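
As an illustration only, here is a small constant lookup table. The
names nibble_bits and count_bits are invented for this sketch. Whether
a "static const" array really ends up in ROM depends on the toolchain
and linker configuration, so treat that as an assumption to verify on
your target.

```c
/* Number of set bits in each 4-bit value. Declared const so that a
   toolchain that supports it can place the table in read-only memory
   rather than in RAM. */
static const unsigned char nibble_bits[16] = {
    0, 1, 1, 2, 1, 2, 2, 3,
    1, 2, 2, 3, 2, 3, 3, 4
};

/* Count the set bits in the low 8 bits of byte using two table
   lookups; no writable data is needed. */
unsigned count_bits(unsigned char byte)
{
    return nibble_bits[byte & 0x0Fu] + nibble_bits[(byte >> 4) & 0x0Fu];
}
```

A table filled in at startup would give the same results but would
occupy RAM for the whole run.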

And of course *all* this stuff is extremely system-specific. None of
this advice should be followed blindly. Understand the system you're
using, and *measure*.

I'm sure the folks in comp.arch.embedded know more about this stuff
than I do.

--
Keith Thompson (The_Other_Keith) kst-u@xxxxxxx <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.