Re: Simple C containers, std::vector analog
From: Ross A. Finlayson (raf_at_tiki-lounge.com)
Date: 03/18/05
- Next message: Kobu: "Re: Do input functions use fgetc inside them or read()"
- Previous message: CBFalconer: "Re: fgets() and embedded null characters"
- In reply to: Ross A. Finlayson: "Re: Simple C containers, std::vector analog"
- Next in thread: Ross A. Finlayson: "Re: Simple C containers, std::vector analog"
- Reply: Ross A. Finlayson: "Re: Simple C containers, std::vector analog"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Date: 17 Mar 2005 20:18:39 -0800
I get to thinking about how to implement a vector. Currently I just
have a bunch of macro functions that represent a subset of C++'s
std::vector implementation.
Basically, I don't need a vector, just a growable array with the
ability to add elements to the back, and a forward iterator from the
front.
As part of its being growable, basically I want to minimize
reallocation. Also, and this is platform specific but a general
concern, I don't want to have fragmented bits in memory, but instead
blocks of memory naturally aligned to the machine and runtime's
allocation method for physical random access memory, RAM.
So I started with a struct:
struct x_vec_t{
size_t size;
size_t capacity
x_vecstate_t vecstate;
x_t* data;
};
I search for "linked list of memory pages" and get to
http://www.gotw.ca/publications/mill14.htm , and basically it seems the
C++ container that is implemented with a linked list of memory pages is
double-ended queue or deque. Mmm... sundaes. Anyways, I think I'm
more interested in an array of memory pages, instead of a linked list,
for a queue. This is where there are generally going to be more
elements in the growable array than the size of a memory page. I want
to avoid reallocation, but I want the pointers to the memory pages in
order so that random access to their pointer values is constant time.
struct x_vec_t{
size_t size;
size_t capacity;
x_vecstate_t vecstate;
memptr_t * pages
x** data;
};
Then, instead of using [] to randomly access a data*, indirection is
used to access the data** for the one data* that the page contains,
using a mask computed from the page size and some shifts.
Then, the user might want the contiguous array out of the growable
array for functions that just work on contiguous arrays, not these
abstract data structures. The data might already be well-organized for
that. If it's not, then a function allocates its own array of the
correct size and then another function copies the elements in order
from the container to the array, from the growable array to the array,
basically forming a snapshot. Alternatively, the growable array can be
marked with a flag that indicates that its x* at data[0] is the
beginning of the contiguous array of size many element of type x, with
a function to reorganize itself by allocating as much memory as
necessary to contain its not-necessarily contiguous memory pages and
copy them over itself, having the contiguous flag set until it reserves
more capacity by allocating another page and adding its offset into the
array of pages, without reallocating all the other data pages.
So then when the structure is initialized, it allocates a memory page
for the page pointer, or sizeof many pages, and a memory page for the
data pointer, or sizeof many or some other amount per the allocation
strategy. On lots of machines these days, the memory page size is
around 4 kilobytes, which means that around 1024 many pages could be
allocated with pointers from the page array before that would have to
be reallocated. On some machines the page size is several megabytes
and instead of getpagesize a constant macro can replace the function,
that as well for platforms where there is no getpagesize function or
other function returning some ideal memory block size.
So, that appears to be replaceable by the definition of a queue instead
of a vector, because there's only addition of elements to the end and
not the beginning. It's kind of like a template, for each type x_t to
be contained there is declared struct x_vec_t, or x_array_t.
Let me see if I can recount these requirements: I want basically an
array, that is growable as painlessly as possible. The characteristics
of the array is that it might have only a few thousand elements or
several millions or billions of elements. Thus it should avoid
reallocation of the data array as possible because the eventual count
of elements of the growable array is unknown at the outset. It should
be readily compatible with functions that expect fixed arrays. It
should have constant time access to elements and amortized constant
time growing.
There you go.
Warm regards,
Ross F.
- Next message: Kobu: "Re: Do input functions use fgetc inside them or read()"
- Previous message: CBFalconer: "Re: fgets() and embedded null characters"
- In reply to: Ross A. Finlayson: "Re: Simple C containers, std::vector analog"
- Next in thread: Ross A. Finlayson: "Re: Simple C containers, std::vector analog"
- Reply: Ross A. Finlayson: "Re: Simple C containers, std::vector analog"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Relevant Pages
|