Re: Newbie on the loose.. How to add an unknown length dataset to an array

Mark Morss wrote:
Although the "pointer method" is elegant and requires only one array
allocation, it does require enough virtual memory to hold twice the
data. One times the data (plus the pointers) for a linked list, one
times the data for the allocated array. Only later does the list
shrink to nothing. But that is the only disadvantage that I can see.

I don't see a way around this space issue, short of frequent
re-allocations. I would be most curious to know how costly these would
be. I imagine they are fairly costly, which is why I prefer the
"pointer method." When you re-allocate, you're repeatedly switching
and copying between two arrays, aren't you? That seems rather
expensive. Maybe somebody could correct me if the method of repeated
reallocations is actually as efficient as the linked list method,
but I would be surprised if it were.

A problem I see with the linked list approach is that there may be *more*
allocation than with the reallocation approach. Every time a new datum is added
to the list, you need to allocate a new list node. Whereas with reallocation,
the allocations can be grouped together.

Example: when I'm reading data of unknown size into an array, I reallocate the
array to twice its current size when it fills up. So, for example, reading 1024
values requires 16 allocations (and 16 deallocations) -- 8 for the data array,
and 8 for the temporary array for saving values during reallocation. Whereas,
creating a linked list will require 1024 allocations.
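
A minimal sketch of the doubling scheme, for illustration only. The module
name growable_array, the routine append_value, the n_alloc counter and the
initial capacity of 8 are all assumptions made for this example. It also
assumes a compiler with the Fortran 2003 move_alloc intrinsic, which hands
the larger block over to the data array directly, so the values are copied
once per growth rather than saved to a temporary and copied back as
described above.

```fortran
module growable_array
  implicit none
  private
  public :: append_value

contains

  ! Append x to vals(1:n), doubling the capacity when the array is full.
  ! n is the number of values in use; size(vals) is the current capacity.
  ! n_alloc is a running count of allocate statements executed.
  subroutine append_value(vals, n, x, n_alloc)
    real, allocatable, intent(inout) :: vals(:)
    integer, intent(inout) :: n
    real, intent(in) :: x
    integer, intent(inout) :: n_alloc
    real, allocatable :: tmp(:)

    if (.not. allocated(vals)) then
      allocate(vals(8))              ! assumed initial capacity
      n = 0
      n_alloc = n_alloc + 1
    else if (n == size(vals)) then
      allocate(tmp(2*size(vals)))    ! new block, twice the old capacity
      n_alloc = n_alloc + 1
      tmp(1:n) = vals(1:n)           ! copy the existing values once
      call move_alloc(tmp, vals)     ! vals takes over tmp; old block is freed
    end if

    n = n + 1
    vals(n) = x
  end subroutine append_value

end module growable_array
```

Calling append_value 1024 times with n and n_alloc starting at zero leaves
n_alloc at 8 under these assumptions (capacities 8, 16, ..., 1024), against
one allocation per value for a list.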



Mdj wrote:
"Mark Morss" wrote:

The most efficient way to do this, in my view, is first to read the
lines of the data into a linked list. I usually define a type to
receive each record of data to be read. Then I construct a list type
consisting of one of these records plus a pointer. I read the data in
a loop, exiting if endfile, otherwise allocating a new list element and
assigning the just-read data to it. At the same time I count the
number of records read.

Having read all the data into the list, I know the size of the array
that I will need. I allocate it, then work down through the list,
transferring the data elements into the rows of the array and
simultaneously deallocating the list elements.
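
A rough sketch of the steps above, not the code from the post. It assumes
each record is a single real read one per line from an already-open unit;
the names list_reader, node and read_all are made up for this example, and
a real record type would replace the val component.

```fortran
module list_reader
  implicit none
  private
  public :: read_all

  ! One list element: a record plus a pointer to the next element.
  type :: node
    real :: val
    type(node), pointer :: next => null()
  end type node

contains

  ! Read reals, one per line, from an open unit until the read fails,
  ! then return them in an array of exactly the right size plus a count.
  subroutine read_all(unit, arr, n)
    integer, intent(in) :: unit
    real, allocatable, intent(out) :: arr(:)
    integer, intent(out) :: n
    type(node), pointer :: head, tail, cur
    real :: x
    integer :: ios, i

    head => null()
    tail => null()
    n = 0
    do
      read(unit, *, iostat=ios) x
      if (ios /= 0) exit             ! end of file (or a read error) ends the loop
      allocate(cur)                  ! one allocation per record
      cur%val = x
      if (associated(tail)) then
        tail%next => cur
      else
        head => cur
      end if
      tail => cur
      n = n + 1                      ! count records as they are read
    end do

    allocate(arr(n))                 ! the single array allocation
    do i = 1, n
      cur => head
      arr(i) = cur%val
      head => cur%next
      deallocate(cur)                ! the list shrinks as the array fills
    end do
  end subroutine read_all

end module list_reader
```

The caller opens the file, calls read_all(unit, arr, n), and gets back both
the array and the record count.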

When I'm done, I have the desired array and nothing else. My
subroutine can return it and let the calling program figure out its
size, but usually I just return an additional integer signifying array size.

I would be happy to supply an example if this isn't sufficiently clear.
However, if you have Chapman's excellent Fortran 95 for Scientists and
Engineers, Chapter 15 is highly illustrative.

I don't have that book (we used "Fortran 95/2003 Explained" in our course),
but I have done something like what you describe, just with some random
data, so I understand what you mean. The thing is, which is the most
effective? What Arno does reminds me of how large databases (MSSQL etc.)
assign space, and your pointer method makes good sense since it doesn't
juggle a potentially larger and larger array. But I guess there isn't
really a golden way to do it, then? Well, perhaps I'm just making a bit
too much of a fuss over something that potentially only takes a fraction of
the total computing time anyway :)

Thanks for your answers :)