Re: A challenging file to parse



I understand that this is a C group and solutions tend towards making
use of C language features/libraries. However, I viewed it as an
algorithm problem and found a O(N) run-time and O(N) memory solution,
where n is the total numbers in the file.

Assumptions: the numbers are in a matrix like structure with R rows
and C columns. I realized later that this is too severe a restriction
for your case.

Summary: When coding in C, we parse from left-to-right and top-to-
bottom. Do the same thing, but when storing in an array, store it at
locations that appear as if we have read top-to-bottom and left-to-
right [array storing is an issue in your case, but anyway, here is my
solution]

To explain better, lets assume
R = Rows in the matrix
C = Columns in the matrix
N = R * C = total numbers
SCAN[N] = Array that is storing all these numbers
i = row index
j = column index
p = j*R + i; index in to array SCAN[]

So, when reading the numbers from left-to-right [as in C], store them
at index 'p'. In O(N) you will read complete file, store it and print
it on screen.

Anyway, this solution is too restrictive.


On Aug 21, 1:07 pm, david.de...@xxxxxxxxx wrote:
I have a group of files in a format that is that is tab delimited with
about a million columns and a thousand rows.

Reading this file left-to-right top-to-bottom is not a problem but my
requirements are to read it top-to-bottom left-to-right (to read each
column in order as follows).

1,4,7
2,5,8
3,6,9

It's an O(n^2) problem if I read each line for each column (it could
take a week for a big file). The file is too big to hold the lines in
memory and I see no strategy where I can hold a subset of lines in
memory.

This seems like a simple problem but I have struggled with lots of
solutions that have come up short.

Any suggestions would be appreciated.

Thank you.


.



Relevant Pages

  • Re: A challenging file to parse
    ... Do the same thing, but when storing in an array, store it at ... Also the data format. ...
    (comp.lang.c)
  • Re: Storing large number of values in 2D array
    ... array.How can I store them and retrieve them later? ... around 32 000 bytes for the 2D array. ... But that was a memory efficient solution, ... I have seen proposals to add to C ...
    (comp.lang.c)
  • Re: Database in memory
    ... > I have a large amount of information which I store in memory. ... > this information as array of structures using CPtrArray. ... > several tables completely in memory and query it with SQL? ...
    (microsoft.public.vc.mfc)
  • Database in memory
    ... I have a large amount of information which I store in memory. ... this information as array of structures using CPtrArray. ... several tables completely in memory and query it with SQL? ...
    (microsoft.public.vc.mfc)
  • Database in memory
    ... I have a large amount of information which I store in memory. ... this information as array of structures using CPtrArray. ... several tables completely in memory and query it with SQL? ...
    (microsoft.public.vc.mfc)