Re: low-level pointer vs. array question



copx wrote on 04/25/07 12:00:
"Mike Wahler" <mkwahler@xxxxxxxxxxxx> wrote in message
news:zJJXh.8284$3P3.689@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

"copx" <copx@xxxxxxxxx> wrote in message
news:f0no2f$1b6$1@xxxxxxxxxxxxxxxxxx

Unfortunately, I know next to nothing about ASM and compiler
construction, and while I was aware of the syntactic differences between
pointers and arrays, I was not aware of this:

http://udrepper.livejournal.com/13851.html
(Yes, it is livejournal, but it is the journal of the glibc maintainer)

I don't have time to investigate that article right now,
so I can't comment on its validity. I can recommend this
one: http://pweb.netcom.com/~tjensen/ptr/pointers.htm
Also Google for Chris Torek's article "C for smarties".
IMO Mr. Torek is deservedly considered one of comp.lang.c's
'resident gurus'. I have learned a considerable amount about
C from reading his posts.

[snip]

Thanks, Chapter 6 answered my question, because Mr. Torek did not shy away
from explaining what the statements actually translate to. So now I know
that the array version is indeed superior for my purposes. It allocates
memory only for the strings while the pointer version additionally allocates
memory for pointer variables. Also accessing an individual string is faster,
because in the array version foo_array[SOME_INDEX] translates to a constant
memory address, while in the pointer version the program has to load a
pointer variable to get the address of the string.

Note that "it allocates memory only for the strings"
is not necessarily the case. The context, you recall, is

    const char foo_array[N_ENTRIES][ENTRY_SIZE] = {
        "foo",
        "bar"
    };

versus

    const char * foo_array[N_ENTRIES] = {
        "foo",
        "bar"
    };

... so "only for the strings" holds only if ENTRY_SIZE == 4.
If your strings are of various lengths:

    const char states[50][ENTRY_SIZE] = {
        "Alabama",
        "Alaska",
        "Arizona",
        ...
        "West Virginia",
        "Wisconsin",
        "Wyoming",
    };

... you will need to make ENTRY_SIZE at least one greater
than the length of the longest string, hence at least 15
("North Carolina" and "South Carolina" are fourteen characters
long, counting the space). The other forty-eight array
elements will hold strings plus extra zero bytes: "Iowa" and
"Ohio" are five-byte strings, but each will be accompanied by
ten extra zeroes. All told, you will use
50 * ENTRY_SIZE >= 750 bytes on the two-dimensional array.

How about the array of pointers? The fifty state names
and their '\0' terminators amount to 472 bytes, and the
pointers will take another 50 * sizeof(char*) bytes. If
sizeof(char*) == 4 (as on most 32-bit machines), the total
will be 672 bytes. That's at least 78 bytes *less* than
the "superior" two-dimensional array! The "only for the
strings" solution actually uses about twelve percent *more*
memory than strings-and-pointers!
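
Totals like these are easy to get wrong by hand, so here is a
minimal sketch that counts the bytes mechanically. The function
names (longest_name, footprint_2d, footprint_ptrs,
print_state_footprints) are made up for this illustration, and
ENTRY_SIZE is assumed to be 15, the smallest width that holds
every name including its space and terminator. It also assumes
the compiler does not share storage between string literals.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define N_STATES   50
#define ENTRY_SIZE 15   /* assumption: longest name (14 chars) + '\0' */

static const char *const state_names[N_STATES] = {
    "Alabama", "Alaska", "Arizona", "Arkansas", "California",
    "Colorado", "Connecticut", "Delaware", "Florida", "Georgia",
    "Hawaii", "Idaho", "Illinois", "Indiana", "Iowa",
    "Kansas", "Kentucky", "Louisiana", "Maine", "Maryland",
    "Massachusetts", "Michigan", "Minnesota", "Mississippi", "Missouri",
    "Montana", "Nebraska", "Nevada", "New Hampshire", "New Jersey",
    "New Mexico", "New York", "North Carolina", "North Dakota", "Ohio",
    "Oklahoma", "Oregon", "Pennsylvania", "Rhode Island", "South Carolina",
    "South Dakota", "Tennessee", "Texas", "Utah", "Vermont",
    "Virginia", "Washington", "West Virginia", "Wisconsin", "Wyoming"
};

/* Length of the longest name, not counting the '\0'. */
size_t longest_name(const char *const *names, size_t n)
{
    size_t longest = 0;
    for (size_t i = 0; i < n; ++i) {
        size_t len = strlen(names[i]);
        if (len > longest)
            longest = len;
    }
    return longest;
}

/* Bytes occupied by a two-dimensional array of fixed-width rows. */
size_t footprint_2d(size_t n, size_t entry_size)
{
    return n * entry_size;
}

/* Bytes occupied by the pointer table plus the strings it points to. */
size_t footprint_ptrs(const char *const *names, size_t n)
{
    size_t total = n * sizeof names[0];      /* the pointers themselves */
    for (size_t i = 0; i < n; ++i)
        total += strlen(names[i]) + 1;       /* each string and its '\0' */
    return total;
}

/* Print both totals for the fifty state names. */
void print_state_footprints(void)
{
    printf("longest name:     %zu characters\n",
           longest_name(state_names, N_STATES));
    printf("2-D array:        %zu bytes\n",
           footprint_2d(N_STATES, ENTRY_SIZE));
    printf("pointers+strings: %zu bytes\n",
           footprint_ptrs(state_names, N_STATES));
}
```

Call print_state_footprints() from main to see the numbers for
your own platform. Note that the outcome depends on the pointer
size: where sizeof(char*) == 8 the pointer version comes to
472 + 400 = 872 bytes, and the two-dimensional array is the
smaller of the two.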

Finally, the remarks about access speed must be taken
with a grain of salt, or perhaps an entire heap of salt.
For one thing, conclusions of this sort are highly context-
dependent: optimizing compilers will play all manner of
strange games to exploit patterns in the code, especially
in and around loops. That pointer load you're so worried
about may be completely free, thanks to a prefetch issued
on the preceding loop iteration. Another point is that
you should estimate the potential savings (even if you
can actually achieve them, which isn't certain) in light
of whatever you're going to do with the string once you
know where it is. In a context like

    printf ("The states of the U.S.A. are:\n");
    for (i = 0; i < 50; ++i)
        printf ("\t%s\n", states[i]);

... saving or wasting one memory reference per array access
will make no difference large enough to measure, even with
very sensitive and expensive test equipment. You might
improve your car's fuel economy by driving naked so the
vehicle needn't carry the weight of your clothing, but the
improvement hardly seems worthwhile.
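
For reference, the difference being argued about (one address
computation versus one extra load) can be sketched in C itself.
This is only an illustration of the idea, not what any particular
compiler emits; the names rows, ptrs, row_string and ptr_string,
and the row width of 8, are made up for the example.

```c
#include <stddef.h>

#define N_ENTRIES  3
#define ENTRY_SIZE 8   /* hypothetical row width for this sketch */

static const char rows[N_ENTRIES][ENTRY_SIZE] = { "foo", "bar", "baz" };
static const char *const ptrs[N_ENTRIES]      = { "foo", "bar", "baz" };

/* Two-dimensional array: the string's address is computed from the
   array's own address as base + i * ENTRY_SIZE. No pointer variable
   is read from memory. */
const char *row_string(size_t i)
{
    return (const char *)rows + i * ENTRY_SIZE;
}

/* Array of pointers: the string's address is read out of the table,
   which is the extra load discussed above. */
const char *ptr_string(size_t i)
{
    return ptrs[i];
}
```

Both functions hand back a pointer that printf's "%s" can use in
exactly the same way, so the calling code looks identical; only
the way the address is obtained differs.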

--
Eric.Sosman@xxxxxxx