Re: problem reading/writing structures from and to files



Chris Torek wrote:
In article <1160153264.642499.301250@xxxxxxxxxxxxxxxxxxxxxxxxxxxx>
<arne.muller@xxxxxxxxx> wrote:
I've come across some problems reading structures from binary files.

As others have cautioned, it is often wise to use something other
than "raw binary" format for data files. Problems that were
guaranteed to run on a single machine seem often to expand as
if by magic and suddenly require a heterogeneous network. :-)

That said:

typedef struct {
    int i;
    double x;
    int n;
    double *mz;
    short *intens;
} Data;

I've an array of these structures and their mz and intens pointers
point to arrays with n elements each.

My program can write the Data array into a binary file; after writing
the structure itself (using fwrite), it fwrites (appends) the mz and
then the intens arrays.

In other words, you use fwrite() to write out the i, x, and n
fields (which you need in the file) plus also the "mz" and "intens"
fields (which you do *not* need in the file, since they have to
be replaced on subsequent "re-load-from-file" runs):

    Data *p;
    FILE *somefile;
    ... set up p, p->i, p->x, p->n, etc ...

    somefile = fopen("somename", "wb");
    if (somefile == NULL)
        ... handle error ...

    /* possible additional code here */

    if (fwrite(p, sizeof *p, 1, somefile) != 1)
        ... handle error ...
    if (fwrite(p->mz, sizeof *p->mz, p->n, somefile) != p->n)
        ... handle error ...
    if (fwrite(p->intens, sizeof *p->intens, p->n, somefile) != p->n)
        ... handle error ...

This code is OK, although the initial fwrite() -- writing bytes
from (void *)p for length sizeof *p -- writes three useful values
and two useless ones. It would be "better" (in some sense) to
write only the useful values, by replacing the first fwrite()
with three separate fwrite()s:

    if (fwrite(&p->i, sizeof p->i, 1, somefile) != 1 ||
        fwrite(&p->x, sizeof p->x, 1, somefile) != 1 ||
        fwrite(&p->n, sizeof p->n, 1, somefile) != 1)
        ... handle error ...

My idea is to read this file in one go into memory using fread ...

The simplest way to read it back is to use as many fread()s as
fwrite()s above:

    Data *p;
    FILE *somefile;
    ...
    p = malloc(sizeof *p);
    if (p == NULL)
        ... handle error ...
    somefile = fopen("somename", "rb");
    if (somefile == NULL)
        ... handle error ...

    /* assuming three separate fwrite()s for the useful elements: */
    if (fread(&p->i, sizeof p->i, 1, somefile) != 1 ||
        fread(&p->x, sizeof p->x, 1, somefile) != 1 ||
        fread(&p->n, sizeof p->n, 1, somefile) != 1)
        ... handle error ...

    /* insert range-checking on i, x, and n here if desired,
       to validate the input data */

    if ((p->mz = malloc(p->n * sizeof *p->mz)) == NULL)
        ... handle error ...
    if ((p->intens = malloc(p->n * sizeof *p->intens)) == NULL)
        ... handle error ...
    if (fread(p->mz, sizeof *p->mz, p->n, somefile) != p->n)
        ... handle error ...
    if (fread(p->intens, sizeof *p->intens, p->n, somefile) != p->n)
        ... handle error ...
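
For anyone who wants the fragments above in compilable form, here is a
minimal sketch of the field-by-field write and the matching read. The
function names data_write() and data_read() and the DATA_MAX_ELEMS limit
are my own inventions for illustration, not anything from the original
program. The file is still only readable on a machine with the same
sizes and byte order as the one that wrote it.

```c
#include <stdio.h>
#include <stdlib.h>

typedef struct {
    int i;
    double x;
    int n;
    double *mz;
    short *intens;
} Data;

/* Hypothetical sanity limit on n; pick whatever suits the real data. */
#define DATA_MAX_ELEMS 1000000

/* Write the three scalar fields, then the two arrays.
   Returns 0 on success, -1 on a bad count or a short write. */
int data_write(const Data *p, FILE *fp)
{
    size_t n;

    if (p->n < 0)
        return -1;
    n = (size_t)p->n;
    if (fwrite(&p->i, sizeof p->i, 1, fp) != 1 ||
        fwrite(&p->x, sizeof p->x, 1, fp) != 1 ||
        fwrite(&p->n, sizeof p->n, 1, fp) != 1)
        return -1;
    if (n > 0 &&
        (fwrite(p->mz, sizeof *p->mz, n, fp) != n ||
         fwrite(p->intens, sizeof *p->intens, n, fp) != n))
        return -1;
    return 0;
}

/* Read one record written by data_write(). On success the caller owns
   p->mz and p->intens and must free() them. Returns 0 or -1. */
int data_read(Data *p, FILE *fp)
{
    size_t n;

    p->mz = NULL;
    p->intens = NULL;
    if (fread(&p->i, sizeof p->i, 1, fp) != 1 ||
        fread(&p->x, sizeof p->x, 1, fp) != 1 ||
        fread(&p->n, sizeof p->n, 1, fp) != 1)
        return -1;
    /* Validate n before trusting it as an allocation size. */
    if (p->n < 0 || p->n > DATA_MAX_ELEMS)
        return -1;
    n = (size_t)p->n;
    if (n == 0)
        return 0;
    p->mz = malloc(n * sizeof *p->mz);
    p->intens = malloc(n * sizeof *p->intens);
    if (p->mz == NULL || p->intens == NULL ||
        fread(p->mz, sizeof *p->mz, n, fp) != n ||
        fread(p->intens, sizeof *p->intens, n, fp) != n) {
        free(p->mz);
        free(p->intens);
        p->mz = NULL;
        p->intens = NULL;
        return -1;
    }
    return 0;
}
```

For an array of records, call data_write() once per element when saving
and data_read() once per element when loading.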

(I could even use mmap, since this file is accessed by several
processes read only),

The mmap() routines are dangerously seductive. Using them ties
your code and data to OS- and machine-dependent items, and makes
it error-prone in ways that are not always obvious on first blush.
(For instance, the really odd one is what happens if the file is
truncated after successfully mapping it.)

and then to reconstruct the mz and intens pointers properly.

If you omit them when writing, you can omit them when reading
back, as above.

While mmap() avoids what some people call "unnecessary" copying of
the data (during I/O), that very copying is what makes the code
above so simple and reliable. Often, the simplicity is worth the
performance penalty. (If it is not, one can always complextify
the code later. :-) )

Combining C structures and data and writing it to a file, then reading that file back into memory, is non-trivial.

I invite you all to examine the .DBF file structure of dBASE or FoxPro or Clipper. The file consists of a binary header (certainly a C-like structure) describing record length, number of rows and such. Then another array of structures describing the attributes of each column in a row. The remainder of the .dbf file is text which begins at an offset defined in the header and continues for cols * rows bytes, ending with the ever-popular 0x1A byte.
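
To give a flavour of what reading such a header involves, here is a rough sketch that pulls three fields out of the first 32 bytes of a .dbf file. The byte offsets are the ones commonly documented for dBASE III files (record count at bytes 4 to 7, header length at 8 to 9, record length at 10 to 11, all little-endian); treat them as my assumption and check them against the variant you actually have. The names DbfInfo and dbf_parse_header() are made up for this example. Note that the values are assembled byte by byte rather than by overlaying a C struct on the buffer, so the code does not care about the host's byte order or struct padding.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the header fields we care about. */
typedef struct {
    uint32_t num_records;  /* bytes 4..7: number of rows               */
    uint16_t header_len;   /* bytes 8..9: offset of the first record   */
    uint16_t record_len;   /* bytes 10..11: bytes per row              */
} DbfInfo;

/* Assemble a 16-bit little-endian value from two bytes. */
static uint16_t get_le16(const unsigned char *b)
{
    return (uint16_t)(b[0] | (b[1] << 8));
}

/* Assemble a 32-bit little-endian value from four bytes. */
static uint32_t get_le32(const unsigned char *b)
{
    return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Parse the fixed part of the header from buf (at least 32 bytes).
   Returns 0 on success, -1 if the buffer is too short or the
   values are implausible. */
int dbf_parse_header(const unsigned char *buf, size_t len, DbfInfo *out)
{
    if (len < 32)
        return -1;
    out->num_records = get_le32(buf + 4);
    out->header_len  = get_le16(buf + 8);
    out->record_len  = get_le16(buf + 10);
    if (out->header_len < 32 || out->record_len == 0)
        return -1;
    return 0;
}
```

With those three numbers in hand, the row data starts header_len bytes into the file and runs for num_records * record_len bytes.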

I have worked with this structure for more than 20 years now. I like it. Ten years ago I began writing C programs to manipulate .dbf files. Doable of course but not 'simple' by any means.

Attempts to write structures and data to a single file and then read the file and data in a meaningful way will prove to be non-trivial.

Simpler is better. Define your data in terms of columns per row and rows per file, and write it in text, not binary. Beware the Endians.
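
As an illustration only, here is one way the Data structure from earlier in the thread could be written as text: a first line holding i, x and n, followed by n lines of "mz intens" pairs. The layout, the function names and the TEXT_MAX_ELEMS limit are assumptions of mine, not anything the original poster specified. The %.17g conversion prints enough digits for an IEEE double to survive the round trip.

```c
#include <stdio.h>
#include <stdlib.h>

typedef struct {
    int i;
    double x;
    int n;
    double *mz;
    short *intens;
} Data;

/* Hypothetical sanity limit on n. */
#define TEXT_MAX_ELEMS 1000000

/* Write one record as text: "i x n" on the first line, then one
   "mz intens" pair per line. Returns 0 on success, -1 on error. */
int data_write_text(const Data *p, FILE *fp)
{
    int k;

    if (p->n < 0)
        return -1;
    if (fprintf(fp, "%d %.17g %d\n", p->i, p->x, p->n) < 0)
        return -1;
    for (k = 0; k < p->n; k++)
        if (fprintf(fp, "%.17g %hd\n", p->mz[k], p->intens[k]) < 0)
            return -1;
    return 0;
}

/* Read one record written by data_write_text(). On success the caller
   owns p->mz and p->intens. Returns 0 on success, -1 on error. */
int data_read_text(Data *p, FILE *fp)
{
    int k;

    p->mz = NULL;
    p->intens = NULL;
    if (fscanf(fp, "%d %lf %d", &p->i, &p->x, &p->n) != 3)
        return -1;
    if (p->n < 0 || p->n > TEXT_MAX_ELEMS)
        return -1;
    if (p->n == 0)
        return 0;
    p->mz = malloc((size_t)p->n * sizeof *p->mz);
    p->intens = malloc((size_t)p->n * sizeof *p->intens);
    if (p->mz == NULL || p->intens == NULL)
        goto fail;
    for (k = 0; k < p->n; k++)
        if (fscanf(fp, "%lf %hd", &p->mz[k], &p->intens[k]) != 2)
            goto fail;
    return 0;

fail:
    free(p->mz);
    free(p->intens);
    p->mz = NULL;
    p->intens = NULL;
    return -1;
}
```

The resulting file is larger and slower to parse than the binary one, but it can be inspected with any editor and moved between machines without worrying about byte order or the sizes of int and double.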

--
Joe Wright
"Everything should be made as simple as possible, but not simpler."
--- Albert Einstein ---